Supports Dynamic Language Loading for Smaller Download Size
Coming from here: https://github.com/freedomofpress/dangerzone/issues/460#issuecomment-1652008914
I think it would also be a good idea to support dynamic font loading for languages like this. For example, Thai is not included in the CJK font set. If we were to bundle the fonts, it would increase the image size by a lot. Currently, any Thai text in a document is rendered as tofu.
Thanks for reporting this. Assessing the current language support state and implementing a solution like dynamic loading is something that we plan to do.
Regarding Thai specifically, do you know what would be needed to add support for it? Never mind, I see now that you explained this in our other discussion. Pasting it here for future reference:
the Noto Thai font here: https://github.com/notofonts/thai.
Increased the scope of this issue to consider the wider problem of language support. For a language to be fully supported, we need both the fonts in the "doc to pixels" part (for proper rendering) and the OCR models in the "pixels to PDF" part.
The goal will be to find a scalable way to achieve this. OCR models inflate the container image size by a lot. If we could trim that down and only download them on a "per-need" basis, that would be perfect.
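To make the "per-need" idea concrete, here is a minimal sketch of how the missing OCR models could be worked out before any download happens. It assumes the models are Tesseract `traineddata` files named after their language code; the `TESSDATA_URL` template and both function names are hypothetical and are not taken from the Dangerzone code base.

```python
from pathlib import Path

# Assumption: one "<lang>.traineddata" file per language, fetched from a
# tessdata mirror. The URL template is illustrative only.
TESSDATA_URL = (
    "https://github.com/tesseract-ocr/tessdata_fast/raw/main/{lang}.traineddata"
)


def missing_models(requested, tessdata_dir):
    """Return the requested language codes that have no local model file."""
    tessdata_dir = Path(tessdata_dir)
    return [
        lang
        for lang in requested
        if not (tessdata_dir / f"{lang}.traineddata").is_file()
    ]


def download_plan(requested, tessdata_dir):
    """Map each missing language code to the URL it would be fetched from."""
    return {
        lang: TESSDATA_URL.format(lang=lang)
        for lang in missing_models(requested, tessdata_dir)
    }
```

With only `eng.traineddata` present locally, `download_plan(["eng", "tha"], tessdata_dir)` would contain a single entry for `tha`. Fonts for the "doc to pixels" part could follow the same pattern with a different file layout.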
Offline-mode
The other day, the idea of an offline-ready version of Dangerzone surfaced, where all of these models would be pre-downloaded.
For comparison, here's a breakdown of the application sizes:
| Language Content | container.tar.gz size | Dangerzone.app size |
|---|---|---|
| current (0.4.2) | 900M | 963.6M |
| only eng OCR model | 452M | 578.5M |
| only eng OCR model and no CJK fonts | 381M | 503.4M |
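As a rough illustration of what these numbers mean, the sketch below computes the absolute and relative savings for the container image against the current release. The sizes are copied from the table; the `savings` helper is a hypothetical name used only for this example.

```python
def savings(baseline, trimmed):
    """Return (absolute saving, saving as a percentage of the baseline)."""
    saved = baseline - trimmed
    return saved, 100 * saved / baseline


# container.tar.gz sizes from the table, in the same "M" units.
CURRENT = 900
VARIANTS = {
    "only eng OCR model": 452,
    "only eng OCR model and no CJK fonts": 381,
}

for name, size in VARIANTS.items():
    saved, percent = savings(CURRENT, size)
    print(f"{name}: saves {saved}M ({percent:.1f}%)")
```

By this calculation, shipping only the English OCR model roughly halves the container image, and dropping the CJK fonts as well removes a further 71M.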
Some notes about things to keep in mind about this on Qubes:
- it won't work on offline qubes where the client is run
- the default user's `/home` size is 5GiB. Similarly to what happened on the 0.8.0 release, there may be a need to update the install instructions to increase the size of the client qube by 5GiB or more.
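Given the limited `/home` size, a client that downloads models on demand could check for free space first. This is only a sketch under that assumption: `has_room_for` and the 1 GiB default reserve are hypothetical and not part of Dangerzone.

```python
import shutil

GIB = 1024 ** 3


def has_room_for(download_bytes, path, reserve_bytes=GIB):
    """Check that `path` has space for the download plus a safety reserve."""
    free = shutil.disk_usage(path).free
    return free >= download_bytes + reserve_bytes
```

A caller could run `has_room_for(model_size, "/home/user")` before fetching anything and, if it returns `False`, point the user at the instructions for growing the qube's private storage.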