lute-v3
Update all book stats on background thread so % known etc is more accurate
Description
When reading one book and marking words as known, the % known of other books that contain those words does not change unless each of those books is actually clicked into and viewed.
To Reproduce
Steps to reproduce the behavior, e.g.:
- Read a book. Mark some words as known.
- Return to the main index.
- Observe that the % known for other books containing the words you marked known doesn't change.
- View another book that has the same words you marked known.
- Return to the index and observe that now the % known is updated.
**Implementation notes**
This should be a lightweight implementation to update book stats: Lute users shouldn't have to run a worker (e.g. with Celery or similar), as that's too much overhead. Instead, the calculations could be done on a background thread.
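To make the background-thread idea concrete, here is a minimal sketch using only the Python standard library. It is not Lute's actual code: `recalculate_stats` and `StatsRefresher` are hypothetical names, and the stats function is a placeholder for the real (slow) calculation. In a Flask app the worker would presumably also need its own app context and database session, which this sketch leaves out.

```python
import logging
import queue
import threading


def recalculate_stats(book_id):
    """Hypothetical stand-in for the real, slow per-book stats calculation."""
    return {"book_id": book_id, "percent_known": 0.0}


class StatsRefresher:
    """Recalculates book stats on one daemon thread so requests never block."""

    def __init__(self, calc=recalculate_stats):
        self._calc = calc
        self._pending = queue.Queue()   # book ids waiting to be recalculated
        self._lock = threading.Lock()   # guards self.results
        self.results = {}               # book_id -> latest stats
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def request_refresh(self, book_ids):
        """Queue books for recalculation and return immediately."""
        for book_id in book_ids:
            self._pending.put(book_id)

    def wait_until_idle(self):
        """Block until every queued book has been processed (handy in tests)."""
        self._pending.join()

    def _run(self):
        while True:
            book_id = self._pending.get()
            try:
                stats = self._calc(book_id)
                with self._lock:
                    self.results[book_id] = stats
            except Exception:
                # One failing book must not kill the worker thread.
                logging.exception("stats refresh failed for book %s", book_id)
            finally:
                self._pending.task_done()
```

A route handler could call `request_refresh(...)` when the user leaves a book and render the home page with whatever stats are already stored. A single worker thread keeps the calculations serialized, which avoids several slow calculations competing with each other.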
- https://vmois.dev/python-flask-background-thread/ - notes about threading
- https://stackoverflow.com/questions/40989671/background-tasks-in-flask - a decorator for a simple async job
- calculation might get slow if the user has many books open.
- When user closes a book and returns to the home page, the stats should be updated as soon as possible ... perhaps stats can get pre-calculated for following pages if users update a given page's terms
- possible optimization (premature?): if a book's stats were last updated on date X, and the user has added/modified terms since that date, Lute only has to check the pages sampled for the stats calc to see if they contain any of those updated terms. If not, no update is required.
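The "possible optimization" above could look roughly like the following sketch. All names here (`needs_stats_update`, its parameters, the data shapes) are assumptions for illustration, not Lute's schema. It also simplifies matching to exact membership of a term in a set of sampled-page tokens, which would not cover multi-word terms.

```python
from datetime import datetime


def needs_stats_update(stats_updated_at, sampled_page_terms, changed_terms):
    """Decide whether a book's stats are stale.

    stats_updated_at:   datetime of the last stats calc, or None if never run.
    sampled_page_terms: terms found on the pages sampled for the stats calc.
    changed_terms:      dict mapping term text -> datetime it was last
                        added or modified.
    """
    if stats_updated_at is None:
        return True  # never calculated, so an update is always required

    # Only terms touched after the last calculation can change the result.
    recently_changed = {
        term for term, changed_at in changed_terms.items()
        if changed_at > stats_updated_at
    }
    # Stale only if a recently changed term appears on a sampled page.
    return not recently_changed.isdisjoint(sampled_page_terms)


last_calc = datetime(2024, 1, 10)
changes = {"gato": datetime(2024, 1, 12), "perro": datetime(2024, 1, 5)}
print(needs_stats_update(last_calc, {"gato", "casa"}, changes))
print(needs_stats_update(last_calc, {"perro", "casa"}, changes))
```

With a check like this, most books could be skipped entirely after a reading session, so only the few books that share recently changed terms would need the slow calculation.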
This is unfortunate but intentional. The stats calc for books is quite slow. Updating all books results in a very long lag when you go to the Home Screen.
I think there is another issue about speeding up the stats already. Will check later and if there is I’ll close this issue. Thanks!
It might be interesting to compute this asynchronously to not block the page... but I understand the reasoning.
Thanks
I started investigating the stats calc and resolved some, but not all, of the time lag. Your request still isn't feasible with the existing slow-ish code, but hopefully we can do the stats calcs in a background thread as you mentioned. That's the right way to do it.
I'll repurpose this issue (editing the title and the description) to updating the stats on a background thread. Thanks, cheers!
Hi @patrickayoup, @webofpies has added a nice feature in the develop branch where the stats can be updated by clicking a small icon in the header. That will be in the next release. In the meantime, a background thread would still be nice, so this issue is still open. Cheers!
That sounds like a great addition, thanks @webofpies !
Library to try for background threads:
https://viniciuschiele.github.io/flask-apscheduler/ https://apscheduler.readthedocs.io/en/3.x/
I have used APScheduler in a past project and can confirm it works pretty well and is rather lightweight.