hyphenation
Embedding all languages in the lib makes it too fat
We can edit build.rs manually and change
let langs = vec![
"af",
"hy",
...
"hsb",
"cy"
];
to
let langs = vec!["en"];
to make the final lib much smaller (8.65 MB -> 132 KB). I think it would be nice to have an option in [dependencies.hyphenation] to set which languages are embedded.
Maybe something like this:
[dependencies.hyphenation]
version = "0.6.0"
features = ["nfd"]
language = ["en-us"]
Dictionary embedding was already going to be under a feature flag starting with the next release, and adding individual language flags is certainly an idea worth considering. (It would have to be flags, because the Cargo manifest format and Rust cfg system are not flexible enough to allow as nice a syntax as language = ["en_us"] for library features.) It will probably happen soon, but not immediately.
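For illustration, here is a rough sketch of how per-language flags could be read from a build script. It is not the crate's actual build.rs. It assumes a hypothetical naming scheme in which language "en-us" is gated by a feature called "embed_en-us", and the helper names feature_env_var and selected_langs are made up for this example. The only thing it relies on from Cargo is that each enabled feature is exposed to build scripts as a CARGO_FEATURE_<NAME> environment variable, with the name uppercased and '-' replaced by '_'.

```rust
use std::env;

/// Environment variable Cargo would set for the (hypothetical) feature
/// "embed_<lang>": uppercased, with '-' replaced by '_'.
fn feature_env_var(lang: &str) -> String {
    format!("CARGO_FEATURE_EMBED_{}", lang.to_uppercase().replace('-', "_"))
}

/// Keep only the languages whose feature variable is reported as set.
/// `is_set` is injected so the selection logic can be checked without Cargo.
fn selected_langs<'a>(all: &[&'a str], is_set: impl Fn(&str) -> bool) -> Vec<&'a str> {
    all.iter()
        .copied()
        .filter(|lang| is_set(feature_env_var(lang).as_str()))
        .collect()
}

fn main() {
    // Shortened stand-in for the full language list in build.rs.
    let all = ["af", "hy", "en-us", "hsb", "cy"];
    let langs = selected_langs(&all, |var| env::var_os(var).is_some());
    println!("embedding {} of {} dictionaries: {:?}", langs.len(), all.len(), langs);
}
```

The downside is the one mentioned above: every language needs its own entry in the [features] table, so the manifest grows with the language list.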
Maybe https://crates.io/crates/inflate and https://crates.io/crates/deflate would also help.
use deflate::deflate_bytes;
let data = b"Some data";
let compressed = deflate_bytes(data);
This compresses the en-US dictionary from 132 KB to 20 KB.
Starting with v0.8, embedding all dictionaries should take no more than 2.8 MB. Moreover, the feature embed_en-us has been introduced for the common case of embedding American English in e.g. a small utility.
I would still like to find a better solution; ideally, one which allows end-users to select languages individually without a feature explosion.