Add "export unknown terms" (or "export all terms and statuses") action to Book actions
~~Blocked by #316~~ - this is done now
The parent mapping export used to have an option to export all unknown terms. That could be useful for building vocab lists for books.
The code has some TODO issue_336_export_unknown_book_terms markers for things that should be used for this.
- add action
- add unit test (or restore from existing) -- note that books now add status 0 terms while reading, so the export has to handle those.
- add integration test (or restore from existing)
UPDATE: Lute has a CLI job to export book terms -- see the comment below for notes about what's needed to make this a book action callable from the UI.
As part of this work, any code with TODO issue_336_export_unknown_book_terms should be removed, as I don't think it's used anymore.
No longer blocked.
This is slightly more complicated than the hacky code marked with the TODO, or the language_term_export.py thing.
The current hacky code doesn't include multiword terms. For languages like Classical Chinese, that's important.
I think what needs to happen is an in-memory "render" of each page, something like read.service.start_reading, but without saving all of the status 0 terms. The resulting paragraphs will contain all of the text tokens: net new ones (not saved), saved status 0 ones, and all the rest.
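To make the idea concrete, here is a rough sketch of collecting terms for one page purely in memory. Everything in it is an assumption for illustration: `collect_terms`, the `saved` dict (standing in for a db lookup of term text to status), and the regex word split are not Lute's actual rendering code. The word split also assumes space-delimited text, so it would not work as-is for something like Classical Chinese, which needs the language's real tokenizer.

```python
import re
from typing import Dict, List


def collect_terms(text: str, saved: Dict[str, int], max_words: int = 3) -> List[dict]:
    """Return each distinct term on a page, in order of first appearance.

    `saved` (hypothetical) maps lowercased term text to its status and is
    only read, never written, so nothing new is persisted.
    """
    words = re.findall(r"\w+", text.lower())
    found: Dict[str, dict] = {}
    i = 0
    while i < len(words):
        # Prefer the longest saved multi-word term starting at this word;
        # fall back to the single word if no multi-word term matches.
        for n in range(min(max_words, len(words) - i), 0, -1):
            candidate = " ".join(words[i:i + n])
            if n == 1 or candidate in saved:
                break
        if candidate not in found:
            found[candidate] = {
                "term": candidate,
                "status": saved.get(candidate, 0),  # net new terms count as status 0
                "saved": candidate in saved,
            }
        i += n
    return list(found.values())


def unknown_terms(text: str, saved: Dict[str, int]) -> List[str]:
    """Terms with status 0, whether saved in the db or net new."""
    return [t["term"] for t in collect_terms(text, saved) if t["status"] == 0]
```

With `saved = {"tengo": 1, "sin embargo": 2, "perro": 0}`, the multi-word term "sin embargo" is picked up as one term, and both the saved status 0 term and the net new words come back as unknown, without `saved` being modified.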
The test cases for this are pretty easy, even if the code isn't:
- new book = all words
- new book with some known words
- new book with some multi-word terms
- new book with some status 0 terms
- extraneous status 0 words not included
- between the start and end of each test run, the number of terms saved in the db should not increase, and the book's current text id shouldn't change
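The last check in the list could be wrapped in a small helper that every test reuses. This is only a sketch against a made-up schema: the `words` and `books` tables, the `current_text_id` column, and the `export_fn(conn, book_id)` signature are assumptions, not Lute's real models. It asserts the term count is unchanged, which is slightly stricter than "should not increase".

```python
import sqlite3


def snapshot(conn, book_id):
    """Capture the state that an export must leave untouched."""
    term_count = conn.execute("SELECT COUNT(*) FROM words").fetchone()[0]
    current_text_id = conn.execute(
        "SELECT current_text_id FROM books WHERE id = ?", (book_id,)
    ).fetchone()[0]
    return term_count, current_text_id


def assert_export_is_read_only(conn, book_id, export_fn):
    """Run export_fn and fail if it saved terms or moved the book's position."""
    before = snapshot(conn, book_id)
    result = export_fn(conn, book_id)
    after = snapshot(conn, book_id)
    assert after == before, f"export changed db state: {before} -> {after}"
    return result


def demo_db():
    """In-memory db with the hypothetical schema, for trying the helper out."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE words (id INTEGER PRIMARY KEY, text TEXT, status INTEGER)")
    conn.execute("CREATE TABLE books (id INTEGER PRIMARY KEY, current_text_id INTEGER)")
    conn.execute("INSERT INTO words (text, status) VALUES ('gato', 1)")
    conn.execute("INSERT INTO books (id, current_text_id) VALUES (1, 7)")
    return conn
```

Each test case would then set up its book and terms, call `assert_export_is_read_only(conn, book_id, export_fn)`, and make its own assertions on the returned export rows.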
Since this is long-running, it may need some kind of WebSocket connection to report progress back to the client.
Some good interim progress. Hacked at the language term export job quite a lot, and added a new book_term_export <bookid> <filename> cli job, e.g.:
`flask --app lute.app_factory cli book_term_export 432 sp_terms.csv`
This is a bit slower than the old job, b/c it essentially does the calculations for a full page render for each page. It feels like it should be faster, but whatever.
This can't be added to the "actions" dropdown yet, b/c it has no good way to communicate back to the client. The job just prints to the command line, but when triggered from the web UI it should really report progress over a web socket, and then download the file at the end. Since the job is slow-ish, the user should be told what's happening.
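One possible way to decouple the job from how it reports progress is to pass in a callback: the CLI would pass something that prints, and a web handler could pass something that pushes a message over a socket. The sketch below is hypothetical throughout. `export_pages`, its `report(done, total)` callback, and the per-page term lists are illustrative names, not the existing job's interface, and no particular WebSocket library is implied.

```python
import csv
import io
from typing import Callable, Iterable, List


def export_pages(
    pages: Iterable[List[str]],
    out,
    report: Callable[[int, int], None],
) -> int:
    """Write the distinct terms from all pages to `out` as CSV.

    `report(done, total)` is called after each page so the caller decides
    how progress is surfaced (print, socket message, etc.).
    Returns the number of distinct terms written.
    """
    pages = list(pages)
    writer = csv.writer(out)
    writer.writerow(["term"])
    seen = set()
    for done, terms in enumerate(pages, start=1):
        for term in terms:
            if term not in seen:
                seen.add(term)
                writer.writerow([term])
        report(done, len(pages))
    return len(seen)


def cli_report(done: int, total: int) -> None:
    """What the command-line job might pass as `report`."""
    print(f"page {done} of {total}")


def export_to_string(pages: Iterable[List[str]]) -> str:
    """Convenience wrapper for trying it out without a file."""
    buf = io.StringIO()
    export_pages(pages, buf, cli_report)
    return buf.getvalue()
```

With this shape the slow per-page work stays the same for both callers, and only the `report` function differs between the CLI and the UI action.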