Flattens .yml files for i18n

Open zachmargolis opened this issue 1 year ago • 4 comments

What's in the Branch

Based on the challenges of merging multi-line, nested YML files for translation, this PR proposes flattening them to files with flat keys, no indentation needed.

  • scripts/yml_to_flat_yml takes a batch of yml files and converts them to txt
  • I18nFlatYmlBackend implements loading these files for the I18n gem
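
To illustrate what "flat keys" means here, a minimal Ruby sketch of the flattening step follows. It is not the code from scripts/yml_to_flat_yml; the method name flatten_keys and the sample data are hypothetical, and arrays are simply left as values.

```ruby
# Hypothetical helper (not the PR's script): turns a nested hash into a
# single-level hash whose keys are the dotted paths to each leaf value.
def flatten_keys(hash, prefix = nil)
  hash.each_with_object({}) do |(key, value), flat|
    full_key = [prefix, key].compact.join(".")
    if value.is_a?(Hash)
      # Recurse into nested hashes, carrying the path so far as the prefix.
      flat.merge!(flatten_keys(value, full_key))
    else
      flat[full_key] = value
    end
  end
end

# Made-up sample data for illustration only.
nested = { "account" => { "title" => "Cuenta", "links" => { "help" => "Ayuda" } } }
flat = flatten_keys(nested)
```

With the sample above, flat holds two entries keyed account.title and account.links.help.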

Next Steps

If we wanted to land this, we'd need to:

  • [x] update our normalize-yaml JS script to have an option to parse/fix these files as well
  • [x] update our use of i18n-tasks (need to make a compatible parser for it, it doesn't reuse i18n ones 🙄 )

Example

Example of how to generate one of these .txt files:

find config/locales -type f -name '*es.yml' | xargs ./scripts/yml_to_txt > config/locales/es.txt

Format specification

Values are JSON-encoded so that each string fits on one line. If we need newlines inside strings, we'll use \n.

string.key.with.parts: "JSON string version of value"
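
As a rough sketch of this line format, assuming keys never contain ": " and using Ruby's standard json library, one line could be written and read back like this. The method names encode_line and decode_line are hypothetical and do not come from the PR.

```ruby
require "json"

# Hypothetical helpers for the "key: JSON string" line format.
def encode_line(key, value)
  # JSON.generate escapes newlines as \n, so the value stays on one line.
  "#{key}: #{JSON.generate(value)}"
end

def decode_line(line)
  # Split on the first ": " only; the value itself may contain colons.
  key, json = line.split(": ", 2)
  [key, JSON.parse(json)]
end

line = encode_line("string.key.with.parts", "Line one\nLine two")
key, value = decode_line(line)
```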

To support arrays, I went with "if all keys of a hash are numeric, it should be an array" (we have one case of a hash with mixed numeric and string keys, which stays a hash under this rule).
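
A minimal sketch of that rule when rebuilding nested data from flat keys, assuming keys are strings and that "numeric" means digits only. The names unflatten and arrayify are hypothetical, not taken from I18nFlatYmlBackend.

```ruby
# Hypothetical: rebuilds a nested structure from dotted keys.
def unflatten(flat)
  nested = {}
  flat.each do |dotted_key, value|
    *parents, leaf = dotted_key.split(".")
    # Walk (and create) the intermediate hashes, then set the leaf.
    node = parents.reduce(nested) { |hash, part| hash[part] ||= {} }
    node[leaf] = value
  end
  arrayify(nested)
end

# Converts any hash whose keys are all numeric into an array, ordered by index.
def arrayify(node)
  return node unless node.is_a?(Hash)

  converted = node.transform_values { |child| arrayify(child) }
  if !converted.empty? && converted.keys.all? { |k| k.match?(/\A\d+\z/) }
    converted.sort_by { |k, _| k.to_i }.map { |_, v| v }
  else
    # Mixed numeric and string keys stay a hash.
    converted
  end
end
```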

Alternative to handle arrays

Another approach could be a special symbol for arrays like:

a.b.c.#0: "first item"
a.b.c.#1: "second item"
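
For comparison, a sketch of how keys in this alternative form could be generated, assuming "#" plus the index is used as the path segment for array items. The method name flatten_with_array_markers is hypothetical.

```ruby
# Hypothetical: flattens hashes and arrays, marking array indices with "#".
def flatten_with_array_markers(value, prefix = nil, flat = {})
  case value
  when Hash
    value.each do |key, child|
      flatten_with_array_markers(child, [prefix, key].compact.join("."), flat)
    end
  when Array
    value.each_with_index do |child, index|
      segment = "#" + index.to_s
      flatten_with_array_markers(child, [prefix, segment].compact.join("."), flat)
    end
  else
    flat[prefix] = value
  end
  flat
end

flat = flatten_with_array_markers("a" => { "b" => { "c" => ["first item", "second item"] } })
```

The explicit marker means a loader would not have to guess whether a hash with numeric keys was meant to be an array.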

zachmargolis avatar Apr 24 '24 22:04 zachmargolis

I expect the identity-rails-i18n-webpack-plugin JavaScript package will need some updates here as well, since it's currently implemented to read the YAML files.

aduth avatar Apr 25 '24 13:04 aduth

It seems like we don't have that many strings that include newlines. I wonder if the "values" here could just be raw text to avoid having to deal with JSON encoding and make them easier to edit by hand. This would encourage splitting large blocks up into multiple strings and handling formatting concerns in the views instead.

Alternately, raw text with \n supported for a newline would probably be fine, since it's unlikely the actual text would ever contain that? You could even allow a literal \n via \\n or something. What I'm saying is we probably don't have to be perfect here and that maybe the ergonomics of a simpler editing format outweigh the need to allow strings that include the literal text "\n".
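
A sketch of what that escaping scheme could look like, assuming \n stands for a newline and \\ stands for a literal backslash (so \\n reads back as the literal text "\n"). The method names are hypothetical.

```ruby
# Hypothetical helpers for a raw-text value format.
def escape_value(text)
  # Escape backslashes first so existing "\n" text is not mistaken for a newline.
  text.gsub("\\") { "\\\\" }.gsub("\n") { "\\n" }
end

def unescape_value(raw)
  # A backslash followed by "n" is a newline; any other escaped character is kept as-is.
  raw.gsub(/\\(.)/) { Regexp.last_match(1) == "n" ? "\n" : Regexp.last_match(1) }
end

original = "Line one\nThe literal text \\n stays intact"
round_tripped = unescape_value(escape_value(original))
```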

matthinz avatar Apr 25 '24 21:04 matthinz

After some messing around, it turns out that this format is valid YAML, so I switched it around and tools like prettier can work with it. I pulled the telephony files back out into their own separate YMLs, since those do have explicit newlines and are a bit easier.
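
A quick way to see this, using Ruby's standard yaml library: a dotted key is an ordinary YAML string key, and a JSON-style double-quoted string is also a valid YAML double-quoted scalar. The keys and values below are made up for illustration.

```ruby
require "yaml"

# Quoted heredoc identifier: the \n below reaches the YAML parser as a
# literal backslash followed by "n", exactly as it would sit in a file.
flat_yaml = <<~'YAML'
  account.title: "Cuenta"
  account.help: "Line one\nLine two"
YAML

translations = YAML.safe_load(flat_yaml)
# The result is a single-level hash; the dots are part of the key strings,
# and YAML turns the \n escape into a real newline.
```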

zachmargolis avatar Apr 25 '24 21:04 zachmargolis

Wanted to add: I have merged main into this branch a few times, and scripted the work needed to "true-up" with main:

./scripts/yml_fix_merge_conflicts --force
make normalize_yaml

The script merges new keys into the combined/flattened yml files, so it accepts whatever updated keys are on main.
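
The idea can be sketched as a plain hash merge in which main's values win. This is only an assumption about the behavior described, not the contents of scripts/yml_fix_merge_conflicts; the method name true_up and the sample keys are hypothetical.

```ruby
# Hypothetical: combines flat translations, preferring values from main,
# and sorts by key so the output is stable.
def true_up(branch_flat, main_flat)
  branch_flat.merge(main_flat).sort_by { |key, _| key }.to_h
end

branch = { "b.key" => "old", "a.key" => "branch only" }
main = { "b.key" => "new", "c.key" => "added on main" }
merged = true_up(branch, main)
```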

zachmargolis avatar May 03 '24 20:05 zachmargolis

i18n_spec.rb includes a flatten_hash helper which I assume will be unnecessary after these changes. Could save for a follow-on pull request to reduce the scope here.

aduth avatar May 07 '24 13:05 aduth