lute-v3

Update all book stats on background thread so % known etc is more accurate

Open patrickayoup opened this issue 1 year ago • 7 comments

Description

When reading one book and marking words as known, the % known of other books that contain those words does not change unless that book is actually opened and viewed.

To Reproduce

Steps to reproduce the behavior, e.g.:

  1. Read a book. Mark some words as known.
  2. Return to the main index.
  3. Observe that the % known words of other books with the same words that you marked known doesn't change.
  4. View another book that has the same words you marked known.
  5. Return to the index and observe that now the % known is updated.

Implementation notes

The implementation for updating book stats should be lightweight: Lute users shouldn't have to run a separate worker (e.g. Celery or similar), which is too much overhead. Instead, the calculations could be done on a background thread.
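A minimal sketch of the background-thread idea, using only the standard library. `BookStatsRefresher` and the `calc_stats` callable are hypothetical names for illustration, not Lute's actual code; the sketch assumes the slow stats calculation can be wrapped in a function that takes a book id and returns a dict.

```python
import threading


class BookStatsRefresher:
    """Recalculate stats for stale books on one daemon thread."""

    def __init__(self, calc_stats):
        # calc_stats is a hypothetical callable: book_id -> stats dict.
        self._calc_stats = calc_stats
        self._lock = threading.Lock()
        self._pending = set()
        self._results = {}
        self._thread = None

    def mark_stale(self, book_ids):
        """Queue books for recalculation and start the worker if needed."""
        with self._lock:
            self._pending.update(book_ids)
            if self._thread is None:
                self._thread = threading.Thread(target=self._run, daemon=True)
                self._thread.start()

    def _run(self):
        while True:
            with self._lock:
                if not self._pending:
                    # Clear the handle under the lock so that a later
                    # mark_stale() call starts a fresh worker.
                    self._thread = None
                    return
                book_id = self._pending.pop()
            try:
                stats = self._calc_stats(book_id)  # slow work, lock not held
            except Exception:
                continue  # a real app would log the failure
            with self._lock:
                self._results[book_id] = stats

    def get(self, book_id):
        """Return the latest calculated stats, or None if not available."""
        with self._lock:
            return self._results.get(book_id)

    def wait_idle(self, timeout=None):
        """Block until the current worker finishes (mainly useful in tests)."""
        with self._lock:
            thread = self._thread
        if thread is not None:
            thread.join(timeout)
```

The request handler would call `mark_stale(...)` and return immediately, so the home page is never blocked by the calculation. In a Flask app, the worker would probably also need its own application context and database session, which this sketch leaves out.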

  • https://vmois.dev/python-flask-background-thread/ - notes about threading
  • https://stackoverflow.com/questions/40989671/background-tasks-in-flask - a decorator for a simple async job.
  • calculation might get slow if the user has many books open.
  • When user closes a book and returns to the home page, the stats should be updated as soon as possible ... perhaps stats can get pre-calculated for following pages if users update a given page's terms
  • possible optimization (premature?): if a book's stats have been updated on date X, and the user has added/modified terms since that date, Lute only has to check the sampled pages for the stats calc to see if they contain any of those updated terms. If not, no update is required
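The optimization in the last bullet could look roughly like the sketch below. All names (`BookStatsRecord`, `books_needing_refresh`, `term_changes`) are hypothetical, and the sketch assumes the tokens of each book's sampled pages are already stored. It matches single tokens only, so multi-word terms would need extra handling.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class BookStatsRecord:
    book_id: int
    stats_updated_at: datetime
    sampled_tokens: frozenset  # lowercased tokens from the sampled pages


def books_needing_refresh(records, term_changes):
    """Return ids of books whose sampled pages contain a recently changed term.

    term_changes maps term text to the datetime it was last added or modified.
    """
    stale = []
    for rec in records:
        # Only terms changed after this book's last stats update matter.
        changed = {
            text.lower()
            for text, when in term_changes.items()
            if when > rec.stats_updated_at
        }
        if changed & rec.sampled_tokens:
            stale.append(rec.book_id)
    return stale
```

Books that are not returned keep their existing stats, so the slow calculation runs only where a changed term could actually affect the result.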

patrickayoup • Jan 21 '24 01:01

This is unfortunate but intentional. The stats calc for books is quite slow. Updating all books results in a very long lag when you go to the Home Screen.

I think there is another issue about speeding up the stats already. Will check later and if there is I’ll close this issue. Thanks!

jzohrab • Jan 21 '24 02:01

It might be interesting to compute this asynchronously to not block the page... but I understand the reasoning.

Thanks

patrickayoup • Jan 22 '24 23:01

I started investigating the stats calc and resolved some, but not all, of the time lag. Your request still isn't feasible with the existing slow-ish code, but hopefully we can do the stats calcs in a background thread as you mentioned. That's the right way to do it.

I'll repurpose this issue (editing the title and the description) to updating the stats on a background thread. Thanks, cheers!

jzohrab • Jan 23 '24 15:01

Hi @patrickayoup, @webofpies has added a nice feature in the develop branch where the stats can be updated by clicking a small icon in the header. That will be in the next release. In the meantime, a background thread would still be nice, so this issue is still open. Cheers!

jzohrab • Jan 28 '24 14:01

That sounds like a great addition, thanks @webofpies!

patrickayoup • Jan 30 '24 03:01

Library to try for background threads:

https://viniciuschiele.github.io/flask-apscheduler/
https://apscheduler.readthedocs.io/en/3.x/
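For reference, a rough sketch of how APScheduler 3.x might drive the refresh. `stale_books`, `refresh_stale_books`, `start_scheduler`, and `calc_stats` are hypothetical names, not existing Lute code. The scheduler is imported inside `start_scheduler` so the job function can be exercised without APScheduler installed.

```python
import threading

# Hypothetical shared state: ids of books whose stats are out of date.
stale_books = set()
stale_lock = threading.Lock()
refreshed = {}


def refresh_stale_books(calc_stats):
    """Job body: drain the stale set and recalculate each book once."""
    with stale_lock:
        batch = sorted(stale_books)
        stale_books.clear()
    for book_id in batch:
        refreshed[book_id] = calc_stats(book_id)
    return batch


def start_scheduler(calc_stats, seconds=30):
    """Run refresh_stale_books every `seconds` on APScheduler's background thread."""
    from apscheduler.schedulers.background import BackgroundScheduler

    scheduler = BackgroundScheduler()
    scheduler.add_job(
        refresh_stale_books,
        "interval",
        seconds=seconds,
        args=[calc_stats],
        id="refresh_book_stats",
        max_instances=1,  # never run two refreshes at the same time
        coalesce=True,    # collapse missed runs into a single run
    )
    scheduler.start()
    return scheduler
```

Marking a term as known would add the affected book ids to `stale_books`, and the next scheduled run would pick them up. The 30-second interval is an arbitrary placeholder.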

jzohrab • Feb 08 '24 16:02

I have used apscheduler before in a past project, can confirm it works pretty well and is rather lightweight.

patrickayoup • Feb 08 '24 17:02