omnivore
detect language from html content
I tested several libraries, and node-cld (https://github.com/dachev/node-cld) seems pretty fast and accurate. It also supports HTML content.
I think we'd be best off if we stored the code, like `en`, instead of the full name, like `English`. I know in the past we stored the full name, but I bet more tools would work with the codes.
@jacksonh Should we store the ISO 639-1 or the ISO 639-3 language code?
We should probably also store a mapping from language code to ISO language name in the backend, so that search can match on either. A simple language-selector dropdown could be added to the UI as well.
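A rough sketch of what that mapping could look like, assuming ISO 639-1 codes as keys. The names `LANGUAGE_NAMES`, `codesMatchingName`, and `languageOptions` are hypothetical and not part of the existing codebase, and only a handful of languages are listed for illustration:

```typescript
// Hypothetical code-to-name table; a real one would cover every supported language.
const LANGUAGE_NAMES: Record<string, string> = {
  en: 'English',
  fr: 'French',
  de: 'German',
  zh: 'Chinese',
}

// Resolve a search term (a code or the start of a language name) to stored codes.
function codesMatchingName(query: string): string[] {
  const q = query.trim().toLowerCase()
  if (q === '') return []
  return Object.keys(LANGUAGE_NAMES).filter(
    (code) => code === q || LANGUAGE_NAMES[code].toLowerCase().startsWith(q)
  )
}

// Build dropdown options for the UI, sorted by display name.
function languageOptions(): { value: string; label: string }[] {
  return Object.keys(LANGUAGE_NAMES)
    .map((code) => ({ value: code, label: LANGUAGE_NAMES[code] }))
    .sort((a, b) => a.label.localeCompare(b.label))
}
```

With this shape, only the code is stored on each item, and a search for either `en` or `English` resolves to the same stored value.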
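I think we should probably use ISO 639-3, since it distinguishes more languages (for example, Mandarin `cmn` and Cantonese `yue`). Note that Simplified versus Traditional Chinese is a difference of script rather than language, so it is usually expressed with a BCP 47 script subtag such as `zh-Hans` or `zh-Hant` rather than by ISO 639-3 itself.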