release-toolkit icon indicating copy to clipboard operation
release-toolkit copied to clipboard

Support Bulk Download from GlotPress

Open AliSoftware opened this issue 2 years ago • 1 comments

Warning: This is a Work in Progress branch. I've only commited this to the remote and opened this draft PR to upload and keep a backup of my experiment while I refine the implementation, but this is still early stage of the implementation

🚧 Next Steps / TODO

  • [ ] Fix rubocop violations
  • [ ] Implement robust Unit Tests for all the new classes and stages
  • [ ] Implement the fastlane actions that will use those new implementations
  • [ ] Undraft this PR

Why?

With us encountering the 429 Too Many Requests quota limitation more and more often while downloading translations from GlotPress during our release duties, I've recently asked our Team Global and Orbit (ref: pxLjZ-72D-p2) to install the gp-import-export plugin on our GlotPress instances (ref: 307-gh-Automattic/Orbit), so that we can bulk-download the translations for all the locales at once (as a ZIP).

This will allow us to only make a single network request to export them instead of one request per locale (which ends up being a lot of requests especially for WordPress and the 40+ locales it supports)

How?

I've reimplemented the logic for the download of GlotPress exports into a more modern infrastructure / code architecture in our code base. I intend to deprecate glotpress_helper and gp_downloadmetadata_action once this new implementation is validated as more stable, and refactoring android_download_translations_action / ios_downlaod_strings_files_from_glotpress to use the new helpers and models (or deprecating those actions and creating new ones to replace them).

The new implementation/architecture is structured like this:

Models to manipulate locales easily

This will supersede my #296 draft PR (which I opened… one year ago!)

  • models/locale: is a Struct which represents a single locale, exposing the various locale code representations for that locale — since there are multiple ISO standards for locale codes needed/used in various places of the codebase (ios folder names, android folder names, glotpress locale codes…)
  • models/locales: represents a set of Locale objects. That model allows to build a set of locales based on their locale codes (to have an object representing all the locales supported by a project/app, typically), build convenience sets (e.g. Mag16), easily add/remove/filter locales from sets, and find a known locale from the set of all known locales

A class dedicated to download exports from GlotPress

The helper/glotpress_downloader class knows how to exports from GlotPress in android, ios or json formats.

  • You start by initializing an instance with a GlotPress URL host and URL project path

  • You can then call methods to download exports from that project:

    • Either for a single, given locale, one by one
    • Or — provided the GlotPress host has the gp-import-export plugin installed and supports it — as a bulk download of all the locales at once (as a ZIP stream that will be decompressed by the downloader class on the fly)
  • That class also knows how to parse the odd format used by GlotPress for its JSON exports (where the JSON keys are actually the copy key + \0004 + the original copy, and the JSON values are an array with mostly only one entry at most) and convert it to a usable Hash.

This class is supposed to only have the knowledge of the GlotPress side of things. Once it downloaded the export from GlotPress in memory (File/IO instance) and parsed it, it is not responsible for doing anything with the data except handing it over to the next piece in that new architecture (typically a *_file_writer).
That way, if we want to support alternative Localization backends in the future (e.g. OneSky or Weblate, see 298-gh-Automattic/i18n-issues), this would be nicely decoupled.

File writers — to write the downloaded translations for each locale to disk

  • helper/ios/ios_strings_file_writer knows how to take a single .strings file export, for a single locale, and write it to disk(†).
  • helper/android/android_strings_file_writer knows how to take a single strings.xml file export, for a single locale, and write it to disk(†).
  • helper/fastlane_metadata_file_writer knows how to take a Hash of exported translations (key => translation) for a single locale, and write it as .txt files to disk (typically in fastlane/metadata/…/{locale}/ folders)
    • This class includes a subclass MetadataRule which describes the rules to apply to each key, especially its max_len and the name/path of the corresponding .txt file to associate with it.
    • The FastlaneMetadataFileWrite then knows how to use a list of MetadataRule objects to extract the proper keys from the translations, check their length, try the shorter alternatives (entries for #{original_key}_short, by convention), and write the one that fits the limit into the proper .txt file (or delete the file if no fitting translation was found).
    • The class provides predefined sets of MetadataRule lists for Android and iOS projects, with predefined keys for typical App Store / Play Store metadata.
    • The writer still offers a way to dynamically provide a MetadataRule for any unknown key, allowing us to describe how to handle unexpected/unusual keys (e.g. screenshots) as we encounter them.

(†) Those two *_strings_file_writer helpers are currently very simple, just taking the IO stream and writing it to disk verbatim, with the only logic being to know the relative path to use for the subfolder based on the locale passed. But in the future I'm hoping we can improve those classes to optimize the way files are written to disk, especially reorder the XML nodes in strings.xml files to make sure they are always sorted in the same order, making PR diffs for them easier to review. This is not implemented in this PR yet though.

To Test

🚧 There is currently no unit tests / specs. This is the next step on my TODO list. In the meantime, I've written some "demo code" of how those helpers are supposed to interact with each other and assembled together. This demo code is likely the scaffold of what the future new fastlane actions (the ones using this new implementaion) will look like, but for now they are quite drafty and only serve the purpose of brain-dumping my idea of how that new API of mine would work together at call site.

AliSoftware avatar Aug 16 '22 20:08 AliSoftware

cc @mokagio in case you're curious (squirrel!) about this early implementation and my vision/attempt for the new code modular architecture for all those GlotPress-export-related code (see PR description) 😃


To be clear, this is still WIP and not fully ready for review yet (I'll add you as reviewer once I got unit tests and it's ready); though the new architecture and its separation of concerns used for the new classes (separating the download part, the post-process part, and the writing part, to be ready to potentially support a platform other than GlotPress one day) is already where I want it to be.

But since I'm not sure when I will have time to come back on this code/project in the near future (hence why I decided to push it as draft anyway, to avoid losing my work so far) to finish it, I figured that, knowing you, you might be interested/curious to take a pick at my brain dump and architecture idea here 😉

AliSoftware avatar Aug 16 '22 21:08 AliSoftware