flask-msearch
Update index too slow
I use the Whoosh backend with the jieba analyzer, and SQLAlchemy connected to MySQL.
I use flask-apscheduler to run a task that updates the Whoosh index on a regular schedule. The code looks like this:
@scheduler.task('cron', id='do_refresh_whoosh_index', hour=2)
def refresh_whoosh_index():
    app = scheduler.app
    with app.app_context():
        search.update_index()
The first index update is very quick (there is no index file yet): thousands of rows per second. But when update_index() runs again, it is very slow, only about 10 rows per second. I have to delete the index and recreate it.
Is there another solution? Or did I make a mistake?
Thanks!
flask-msearch updates the index automatically after rows are created or updated, so you shouldn't do it manually. search.update_index() always updates all rows rather than only the new rows.
If you want to update the whole index manually, you should disable MSEARCH_ENABLE and increase the size of yield_per.
Because another application updates some rows of my database table every day, I have to update the index manually every day. (Is that right? When the database is updated by other applications, flask-msearch cannot update the index automatically.)
Now the problem is that delete_index() and update_index() are too slow. Increasing yield_per does not help, because the CPU is the limit.
So I have to delete all the index files manually and create the index again. create_index() is thousands of times faster than update_index() and delete_index().
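For reference, the "delete the index files, then create the index again" workaround could be wrapped in a small helper and called from the scheduled job. This is only a minimal sketch: rebuild_index_from_scratch, index_dir and the create_index callable are hypothetical names, not part of flask-msearch. In a real app the callable would presumably be search.create_index run inside an app context, and index_dir would be the directory the Whoosh backend writes to. Whether a running app tolerates its index directory being removed underneath it is an assumption that would need checking.

```python
import shutil
from pathlib import Path
from typing import Callable, Union


def rebuild_index_from_scratch(
    index_dir: Union[str, Path],
    create_index: Callable[[], None],
) -> None:
    """Remove every file under index_dir, then build a fresh index.

    index_dir and create_index are hypothetical parameters: the caller
    supplies the directory that holds the index files and a function
    that builds the index from the database.
    """
    path = Path(index_dir)

    # Drop the old index files so the rebuild starts from an empty
    # directory instead of updating documents one by one.
    if path.exists():
        shutil.rmtree(path)

    # Recreate the empty directory so the indexer has somewhere to write.
    path.mkdir(parents=True, exist_ok=True)

    # Full build from scratch, supplied by the caller.
    create_index()
```

The helper takes the build step as an argument so that the file handling can be tried out without a database or a search backend.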
Is there a problem with the algorithm?
Thanks!