icu4x
Add BaseLanguageHandling option to Datagen
See https://github.com/unicode-org/icu4x/issues/58
I implemented the new option as BaseLanguageHandling::Retain or BaseLanguageHandling::Strip.
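To make the difference between the two values concrete, here is a toy sketch of deduplication under Retain and Strip. It is not the datagen implementation: the parent function (drop the last subtag, then fall back to und), the string payloads, and the helper names parent, resolve and deduplicate are all assumptions made for illustration, and real ICU4X fallback is more involved.

```rust
use std::collections::BTreeMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BaseLanguageHandling {
    Retain,
    Strip,
}

/// Toy parent function (an assumption, not ICU4X's fallback algorithm):
/// drop the last subtag; a bare language falls back to "und".
fn parent(locale: &str) -> Option<&str> {
    if locale == "und" {
        return None;
    }
    Some(match locale.rfind('-') {
        Some(i) => &locale[..i],
        None => "und",
    })
}

/// Walks the parent chain until a locale with a payload is found.
fn resolve(
    data: &BTreeMap<&'static str, &'static str>,
    mut locale: &str,
) -> Option<&'static str> {
    loop {
        if let Some(payload) = data.get(locale) {
            return Some(*payload);
        }
        locale = parent(locale)?;
    }
}

/// Drops every locale whose payload equals what its parent resolves to.
/// With `Retain`, locales whose parent is "und" (base languages) are kept.
pub fn deduplicate(
    data: &BTreeMap<&'static str, &'static str>,
    handling: BaseLanguageHandling,
) -> BTreeMap<&'static str, &'static str> {
    let mut out = BTreeMap::new();
    for (&locale, &payload) in data {
        let keep = match parent(locale) {
            // "und" itself is always kept.
            None => true,
            // Base languages survive when the handling is `Retain`.
            Some("und") if handling == BaseLanguageHandling::Retain => true,
            // Everything else is kept only if it differs from its parent.
            Some(p) => resolve(data, p) != Some(payload),
        };
        if keep {
            out.insert(locale, payload);
        }
    }
    out
}
```

In this model, an en entry that is identical to und disappears under Strip but stays under Retain, while a de-AT entry that is identical to de is dropped in both cases.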
Current:
- Runtime
- RuntimeManual
- Hybrid
- Preresolved
Proposed to add:
- RuntimeWithBaseLanguages
- RuntimeManualWithBaseLanguages
- @sffc - How do we turn off regional variants?
- @robertbastian - Maybe something like en-XX to not include regional variants. This would be more granular, because you could do de,en-ZZ for all de variants and no en variants.
- @sffc - I think it should be a variant, because it can also apply to regional variants like en-001. So you may want en-001-nochilds, which includes en and en-001 but not en-GB.
- @robertbastian - Maybe we want something like
Deduplicated / Runtime2 {
    use_internal_fallback: bool,
    include_base_languages: bool,
}
Now:
#[non_exhaustive]
pub enum FallbackMode {
    #[default]
    DefaultForProvider,
    // same as Deduplicated { RuntimeFallbackLocation::Internal, BaseLanguageHandling::Strip }
    Runtime,
    // same as Deduplicated { RuntimeFallbackLocation::External, BaseLanguageHandling::Strip }
    RuntimeManual,
    Hybrid,
    Preresolved,
    Deduplicated(Deduplicated),
}
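The equivalences stated in the comments above can be written down as a small conversion. This is only a sketch: the helper as_deduplicated is hypothetical, the types are simplified local copies (runtime_fallback_location is not optional here), and returning None for DefaultForProvider, Hybrid and Preresolved is an assumption about how the remaining variants would be treated.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RuntimeFallbackLocation {
    Internal,
    External,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BaseLanguageHandling {
    Retain,
    Strip,
}

// Simplified copy of the proposed struct, for illustration only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Deduplicated {
    pub runtime_fallback_location: RuntimeFallbackLocation,
    pub base_language_handling: BaseLanguageHandling,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FallbackMode {
    DefaultForProvider,
    Runtime,
    RuntimeManual,
    Hybrid,
    Preresolved,
    Deduplicated(Deduplicated),
}

/// Hypothetical helper: expresses a legacy mode as `Deduplicated` options
/// where the comments on the enum say an equivalent exists.
pub fn as_deduplicated(mode: FallbackMode) -> Option<Deduplicated> {
    match mode {
        FallbackMode::Runtime => Some(Deduplicated {
            runtime_fallback_location: RuntimeFallbackLocation::Internal,
            base_language_handling: BaseLanguageHandling::Strip,
        }),
        FallbackMode::RuntimeManual => Some(Deduplicated {
            runtime_fallback_location: RuntimeFallbackLocation::External,
            base_language_handling: BaseLanguageHandling::Strip,
        }),
        FallbackMode::Deduplicated(options) => Some(options),
        // Assumption: these modes have no `Deduplicated` equivalent.
        FallbackMode::DefaultForProvider
        | FallbackMode::Hybrid
        | FallbackMode::Preresolved => None,
    }
}
```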
2.0:
#[non_exhaustive]
pub enum LocaleExpansionMode {
    Deduplicated(Deduplicated),
    Exhaustive(Exhaustive),
    Preresolved(Preresolved),
}
#[non_exhaustive]
pub enum RuntimeFallbackLocation {
    Internal,
    External,
}
#[non_exhaustive]
pub enum BaseLanguageHandling {
    #[default]
    Retain,
    Strip,
}
#[non_exhaustive]
#[derive(Default)]
pub struct Deduplicated {
    runtime_fallback_location: Option<RuntimeFallbackLocation>,
    base_language_handling: BaseLanguageHandling,
}
// resolve this against the exporter like
foo.runtime_fallback_location.unwrap_or_else(||
    if exporter.supports_runtime_fallback() {
        RuntimeFallbackLocation::Internal
    } else {
        RuntimeFallbackLocation::External
    }
)
#[non_exhaustive]
pub struct Exhaustive {}
#[non_exhaustive]
pub struct Preresolved {}
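A compilable version of the "resolve this against the exporter" step might look like the sketch below. The Exporter trait, the FixedExporter stand-in and the function name resolve_fallback_location are assumptions for illustration. Only supports_runtime_fallback and the field names come from the design above.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RuntimeFallbackLocation {
    Internal,
    External,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub enum BaseLanguageHandling {
    #[default]
    Retain,
    Strip,
}

#[derive(Debug, Clone, Copy, Default)]
pub struct Deduplicated {
    pub runtime_fallback_location: Option<RuntimeFallbackLocation>,
    pub base_language_handling: BaseLanguageHandling,
}

/// Hypothetical stand-in for the exporter interface.
pub trait Exporter {
    fn supports_runtime_fallback(&self) -> bool;
}

/// An explicit choice wins; otherwise the exporter's capability decides.
pub fn resolve_fallback_location(
    options: &Deduplicated,
    exporter: &dyn Exporter,
) -> RuntimeFallbackLocation {
    options.runtime_fallback_location.unwrap_or_else(|| {
        if exporter.supports_runtime_fallback() {
            RuntimeFallbackLocation::Internal
        } else {
            RuntimeFallbackLocation::External
        }
    })
}

/// Test double whose capability is fixed at construction time.
pub struct FixedExporter(pub bool);

impl Exporter for FixedExporter {
    fn supports_runtime_fallback(&self) -> bool {
        self.0
    }
}
```

Leaving the field as an Option means Deduplicated::default() can stay exporter-agnostic, and the Internal/External decision is deferred until the exporter is known.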
LGTM: @Manishearth @sffc @robertbastian
@robertbastian What do you want the CLI to look like in 1.5 and in 2.0?
I guess for now we flatten the enum like runtime-manual-retain and runtime-strip?
If we are going to end up with a base_language_handling option in the Deduplicated struct, then maybe it makes sense that it should be its own flag on the CLI.
It doesn't, because then someone will set base_language_handling for hybrid or preresolved and that doesn't make sense.
We have a lot of flags that are only used if some other flag is set. This would be the same.
@robertbastian I would like to merge this as-is because:
- The final 2.0 CLI option should indeed be named --base-language-handling because that is how we handle all other cases where flags are not used depending on the value of other flags. If you really want, I could make --base-language-handling return an error if set with the wrong value for --fallback.
- In the API, we are changing it anyway in 2.0 as discussed above. The change proposed is the smallest change to the API. I don't want to implement only half of the 2.0 changes now. I would rather we, in another PR, implement LocaleExpansionMode as designed above. We could even do that in 1.5 and deprecate FallbackMode. In that PR, we could delete any unreleased APIs such as this one that are covered by LocaleExpansionMode.
With my engine maintainer hat on, I have to agree with @sffc. Boa will require all engine embedders to use this feature, so from a docs perspective it's easier to guide our users to activate a simple flag than to explain to them which fallback variants are supported and how to activate them.
Need to re-do this on top of #4710
Probably superseded by #4836.
Things this PR does that #4836 didn't do:
- Added to CHANGELOG.md
- Added "tlh-001" to the options test (to test an unsupported non-base locale)
- Made retaining base languages the default option