introduce importer/exporter APIs to replace "storage" plugins
This issue has been raised by @martin-lysk. We agreed that importers/exporters are the right way to go but decided at the Berlin Offsite in Oct 23 to work around the issue as long as possible. The first users are now confused about why their storage plugins are limiting them.
Problem
Inlang's "provide your own storage plugin" setup leads to numerous issues:
- inlang's features are limited by the provided storage plugin (a no-go)
- users don't understand why certain features work or don't work with a provided storage plugin (e.g. https://github.com/inlang/monorepo/issues/1577#issuecomment-1791655712)
Proposal
Storing messages should be controlled by the SDK to ensure that inlang is not limited by external plugins.
- loadMessages should be succeeded by importMessages
- saveMessages should be succeeded by exportMessages
message-29sn82
-> json import: login.button
-> paraglide export: login_button
-> ios export: LOGIN_BUTTON
-> android export: login-button
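The mapping above could be sketched roughly like this. It is a minimal sketch under assumptions: the `ImportedMessage` and `KeyExporter` shapes, the `importedKey` field, and the `exportKey` function are hypothetical names made up for illustration and are not part of the inlang SDK.

```typescript
// Hypothetical plugin shapes for illustration; not the actual inlang SDK API.
type ImportedMessage = { id: string; importedKey: string };

type KeyExporter = {
  id: string;
  // Renders the imported key in the target platform's naming convention.
  exportKey: (importedKey: string) => string;
};

const keyExporters: KeyExporter[] = [
  { id: "paraglide", exportKey: (key) => key.split(".").join("_") },
  { id: "ios", exportKey: (key) => key.split(".").join("_").toUpperCase() },
  { id: "android", exportKey: (key) => key.split(".").join("-") },
];

// The SDK owns the message and its id. The JSON importer only records
// which key the message had in the source file.
const imported: ImportedMessage = { id: "message-29sn82", importedKey: "login.button" };

const exportedKeys: Record<string, string> = {};
for (const exporter of keyExporters) {
  exportedKeys[exporter.id] = exporter.exportKey(imported.importedKey);
}

console.log(imported.id, exportedKeys);
```

The point of the sketch is that the message id stays stable inside the SDK while each exporter decides on its own how the key looks on its platform.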
Pros
- inlang is not limited by external plugins
- import/export APIs can expose what features they support
- multi-platform exports (export for iOS, Android, Paraglide) become possible
- users are told what a target platform supports and what it does not.
- we can performance-optimize the storage instead of naively calling saveMessages() and loadMessages()
- importer/exporter plugins could store additional data, like "message id ieb3s should be exported as login-button" or "ieb2s exists in the namespace file en/login.json". Requires https://github.com/inlang/monorepo/discussions/1418
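To illustrate the "import/export APIs can expose what features they support" point, here is a minimal sketch. The feature names, the `ExporterInfo` shape, the `supports` field, and the exporter ids are all assumptions for illustration; nothing here is an existing inlang API.

```typescript
// Hypothetical feature names and exporter shape, for illustration only.
type Feature = "variables" | "plurals" | "namespaces";

type ExporterInfo = {
  id: string;
  // Features the target format can represent.
  supports: Feature[];
};

// Returns the features used in a project that the exporter cannot represent.
function unsupportedFeatures(used: Feature[], exporter: ExporterInfo): Feature[] {
  return used.filter((feature) => !exporter.supports.includes(feature));
}

const usedInProject: Feature[] = ["variables", "plurals"];

const exporterA: ExporterInfo = { id: "exporter-a", supports: ["variables", "plurals"] };
const exporterB: ExporterInfo = { id: "exporter-b", supports: ["variables"] };

for (const exporter of [exporterA, exporterB]) {
  const missing = unsupportedFeatures(usedInProject, exporter);
  console.log(
    missing.length === 0
      ? `${exporter.id}: all used features are supported`
      : `${exporter.id}: unsupported features: ${missing.join(", ")}`
  );
}
```

With something like this, an app could tell the user up front which features get lost for a given target instead of silently limiting inlang itself.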
Cons
- effort.
- the introduction of https://github.com/inlang/monorepo/discussions/1418 should be completed with this change too. Have a project.inlang folder to avoid massive scatter across a repo
Requirements
- [ ] allow multiple importers and exporters to be used in a project (load and save messages only support one "storage" plugin at a time)
- [ ] must allow for "namespacing" logic, see https://github.com/inlang/monorepo/issues/1577#issuecomment-1791655712. Otherwise existing projects can't migrate to inlang
- [ ] https://github.com/inlang/monorepo/issues/1769#issuecomment-1830270023
- [ ] when is export triggered? onSave [can be ignored for now]
- [ ] how to deal with creation/deletion
@martin-lysk this seems to be a great issue for you. after all, you raised this issue and now it is hurting our growth because users don't understand why feature limitations exist for different storage formats
As discussed in Berlin, this would be an API which finally solves limitations by external plugins and is therefore a good thing.
What (breaking) code change does this mean?
I am trying to get the whole picture. Collecting the inputs from the referenced tickets, it seems like you already have a concept of how this should be integrated. Before I make a proposal that might not match your thoughts: shall we have a kickoff about this issue, @samuelstroschein?
Storing messages should be controlled by the SDK to ensure that inlang is not limited by external plugins.
- So inlang should come "with batteries included" and define a default way to load and save messages? Shall plugins be able to override this behaviour at all?
loadMessages should be succeeded by importMessages; saveMessages should be succeeded by exportMessages
So instead of loading and saving messages (persistence), plugins would import/export messages from sources like .stringsdict files or even APIs like poeditor/localize etc. The imported messages would then be managed by the inlang SDK and stored in the .inlang folder?
Storage Format: How would inlang store messages in the inlang folder?
- Are the files supposed to be edited by users directly like it is the case at the moment? (compare https://github.com/inlang/monorepo/discussions/1464#discussioncomment-7448011)
Yes. You can use any storage format you'd like. For example, the JSON storage plugin https://inlang.com/m/ig84ng0o/plugin-inlang-json, which also reduces the clutter of the inlang message format plugin and makes it easier to write translations manually. This question matters for deciding which format we use.
How do we store the data:
What format
- using the JSON encoded AST like in (https://inlang.com/m/reootnfj/plugin-inlang-messageFormat)
- using the message format schema from mf-wg https://github.com/unicode-org/message-format-wg/tree/main/spec/data-model
- other format
How do we split data
1. Store everything in one big JSON file: all messages with their locales/variants
Pro
- straight forward - just dump the whole AST json into a file like we do in https://inlang.com/m/reootnfj/plugin-inlang-messageFormat
- loading messages means just load one file - easy
- ...
Con
- pulling a change of a single message (done by another editor or push to repo) means fetching all messages with all variants and loading the whole file for now
- merge conflicts - two edits of different messages might lead to merge conflicts until lix understands the format
- loading messages means just load one file - if the project contains thousands of messages with dozens of languages and variants this might become a memory issue
- ...
2. Store messages split by languages / split by namespaces
Pro
- smaller files
- if separated by namespace one could load only a subset of messages by a given namespace without loading the whole file
- files could get handed over to translators by language/namespace
- devs are used to this kind of separation
Con
- motivated by current status quo not by the needs of a storage format
- leads to manual edits of the files that might not be wanted at all
- if a problem occurs in one message (for example a user edited the file manually, or a tool had an encoding error) it breaks the whole format - to fix this the user has to look into huge files
My thoughts: This is how those files are often stored in the old world, to be able to deliver only the needed messages on websites or to load messages into memory only for the current language (iOS language bundles). Since inlang is usually interested in a message as a whole, with all its properties (languages / variants), splitting it this way doesn't make sense for the storage format.
The use case for translators should be managed by import/export plugins instead.
3. Store each message with its locales and variants in a separate file
Pro
- each message is an atomic entity, already at the fs level. As long as you don't edit variants or languages of a message at the same time, you don't have to deal with merge conflicts (even without lix - semantic meaning). Even if we have simultaneous edits, a last-write-wins approach wouldn't hurt too much
- versioning comes out of the box by git's version history
- updates to a message get propagated via the file watcher
- the format could be checked against the message format schema (not really an argument - nothing prevents us from using the schema in our own schema that wraps the messages with a map)
- loading a subset of keys in large projects could be done by filtering filenames
- the API we design around this format is more likely to be similar to the one we have when lix can store it as a whole
- if a problem occurs in one message (for example a user edited the file manually, or a tool had an encoding error) it only breaks that message's file instead of the whole format - the user does not have to look into huge files to fix it
- ...
Cons
- initial load would need to load all files in a folder - thousands of fs.read calls instead of one if the project contains thousands of messages
- git might become slow (compare https://www.monperrus.net/martin/one-million-files-on-git-and-github)
- we add a lot of files to the inlang folder - people using inlang may not like that
- ...
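Option 3 could look roughly like the sketch below. It is a sketch under assumptions: the `files` map is an in-memory stand-in for a real directory, and the `project.inlang/messages/<id>.json` layout, the `StoredMessage` shape, and the function names are hypothetical and chosen only for illustration.

```typescript
// In-memory stand-in for a directory: path -> file content.
// A real implementation would read and write through a file system.
const files = new Map<string, string>();

type StoredMessage = {
  id: string;
  // languageTag -> text; variants are left out to keep the sketch short.
  translations: Record<string, string>;
};

// Hypothetical layout: one JSON file per message inside the project folder.
const pathFor = (id: string): string => `project.inlang/messages/${id}.json`;

function saveMessage(message: StoredMessage): void {
  files.set(pathFor(message.id), JSON.stringify(message, null, 2));
}

// Loads only the messages whose id passes the filter. File names are
// checked first, and only the matching files are parsed.
function loadMessages(filter: (id: string) => boolean): StoredMessage[] {
  const result: StoredMessage[] = [];
  files.forEach((content, path) => {
    const fileName = path.slice(path.lastIndexOf("/") + 1);
    const id = fileName.replace(/\.json$/, "");
    if (filter(id)) {
      result.push(JSON.parse(content) as StoredMessage);
    }
  });
  return result;
}

saveMessage({ id: "login_button", translations: { en: "Log in", de: "Anmelden" } });
saveMessage({ id: "login_title", translations: { en: "Welcome back" } });
saveMessage({ id: "checkout_total", translations: { en: "Total" } });

// Editing one message rewrites exactly one file.
saveMessage({ id: "login_title", translations: { en: "Welcome back", de: "Willkommen zurück" } });

const loginMessages = loadMessages((id) => id.startsWith("login_"));
console.log(loginMessages.map((m) => m.id));
```

The sketch shows the two pros named above (an edit touches a single file, and a subset can be loaded by filtering file names). It does not address the cons, since the initial load still touches every file.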
hey @martin-lysk
- The "inlang directory" change should happen before the importer/exporter stuff https://github.com/inlang/monorepo/discussions/1418.
- I have yet to write a proposal for the directory stuff. Planned to do that early next week.
- The directory proposal will make changes to the message format really easy so we don't have to worry about one big json or not
When would you start with this / the to-be-proposed directory change, so that I know when to write my proposal?
I guess the planned iterations in https://github.com/inlang/monorepo/issues/1459#issuecomment-1801904162 will keep me busy this week - I could have a look next week tuesday.
@martin-lysk okay, I intend to publish my "inlang directory" proposal Monday/Tuesday, which I would give you to implement. Afterwards, the import/export stuff can be handled
@janfjohannes
@samuelstroschein @martin-lysk @janfjohannes What's the status here? Who is leading the importer/exporter & should be assigned?
@felixhaeberle see samuels comment: https://github.com/inlang/monorepo/issues/1585#issuecomment-1802405547
@felixhaeberle my directory implementation will come first. Progress can be tracked in the https://github.com/inlang/monorepo/tree/1678-project-directory branch
Another take on the splitting proposals of martin:
While working with @NiklasBuchfink on the editor/sdk watcher we saw that not having granularity of messages is a nightmare. We watch files with thousands of messages, but we can only lint per message, so as you can imagine that leads to a lot of reactive work. If we had a solution like proposal 3, we could watch individual messages and then only update and lint one message. That would be a huge advantage.
I see the problem Samuel mentioned, that we could then have problems with a lot of files. A granular watcher that works with one file but watches every message could solve the problem too, by shifting complexity to the watcher.
-> Being able to watch for only one message should be a requirement for the SDK improvements.
We need granular reactivity per pattern that changes. Updating a single pattern should only have the linting for that particular pattern as a side effect. A reactivity matrix is needed where we can observe changes in individual patterns and apply CRUD operations. A mappable watcher that acts like a proxy over the files would be the dream. Basically, we need the same approach for file watching as SolidJS has for its reactivity system. Avoid diffing by creating a pub/sub pattern for each small entity that is reactive. That's why the normal watcher is the bottleneck or we have to split everything into a thousand files, which has its own limitations as described in suggestion 3.
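A per-message pub/sub along those lines could be sketched as follows. This is only a sketch: the `MessageStore` class and its `subscribe`/`update`/`get` methods are hypothetical and not an existing SDK API, and the "lint" is a stand-in that records which message it ran for.

```typescript
type Listener = (text: string) => void;

// Hypothetical store: one subscriber set per message id, so a change
// notifies only the listeners of that message and nothing is diffed.
class MessageStore {
  private texts = new Map<string, string>();
  private listeners = new Map<string, Set<Listener>>();

  get(id: string): string | undefined {
    return this.texts.get(id);
  }

  // Subscribes to a single message and returns an unsubscribe function.
  subscribe(id: string, listener: Listener): () => void {
    const set = this.listeners.get(id) ?? new Set<Listener>();
    set.add(listener);
    this.listeners.set(id, set);
    return () => {
      set.delete(listener);
    };
  }

  update(id: string, text: string): void {
    this.texts.set(id, text);
    this.listeners.get(id)?.forEach((listener) => listener(text));
  }
}

const store = new MessageStore();
const linted: string[] = [];

// Stand-in for a lint run that is scoped to one message.
const stopLintingButton = store.subscribe("login_button", () => {
  linted.push("login_button");
});
store.subscribe("login_title", () => {
  linted.push("login_title");
});

// Only the listener for login_button runs.
store.update("login_button", "Sign in");
console.log(linted);
```

How the store gets fed (one big file, many small files, or lix) is a separate question. The sketch only shows that subscribers never see changes to other messages.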
@NiklasBuchfink i created https://github.com/inlang/monorepo/issues/1817. Let's keep this issue for importer/exporter only.
@samuelstroschein what's your take on the scope here? Shall we just add importers/exporters to the inlang SDK and keep the storage topic completely out of the scope of this issue? If so, we would still need a plugin that provides load and save methods, right?
Open questions:
What will become the execution points for importers / exporters - when shall we trigger an import or an export?
Only If storage is part of the ticket (if not we can answer those later):
- should the change be backward compatible / should the plugins' loadMessages and saveMessages be marked as deprecated as part of this
- since we plan to reimplement the save logic and it will most likely need a migration for existing projects, I would iterate on the target persistence format first
Thoughts on import/export triggers:
Compared to the current setup, which only accepts one load and one save, importers and exporters will coexist. A project may have an iOS exporter, an Android exporter, and a JSON exporter all configured at once. Unlike save, which is triggered on each change, and load, which is executed initially and by the watcher, exports/imports are usually triggered externally and not by events coming from the SDK:
Use cases for Export Plugins
- triggers when configured within a ci pipeline as part of the cli
- a button within the editor to trigger an export
- a button in the editor that allows a developer to download the localization files
- ... any hooks in the sdk that an exporter should be triggered on?
Use cases for Import Plugins
- a CLI command that allows importing all keys from an existing set of messages - like ios/android/....
- an external webhook integration (like one from lokalise https://developers.lokalise.com/docs/webhook-events#projectkeymodified that updates the keys) - this would most likely only be a trigger for an import
- an upload of a file, like an iOS strings file, within the editor
- ... do you see any triggering of an import other than external ones?
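To make the "several exporters, triggered from outside" idea concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the `ExportPlugin` shape, the `exportMessages` signature, the `runExport` function, and the two exporters are hypothetical, and the output formats are simplified (no escaping).

```typescript
type Msg = { id: string; text: string };

type ExportPlugin = {
  id: string;
  // Returns the file content for the target format (no escaping, sketch only).
  exportMessages: (messages: Msg[]) => string;
};

// Several exporters can be configured in one project at the same time.
const configuredExporters: ExportPlugin[] = [
  {
    id: "json",
    exportMessages: (messages) => {
      const flat: Record<string, string> = {};
      for (const m of messages) flat[m.id] = m.text;
      return JSON.stringify(flat);
    },
  },
  {
    id: "ios",
    exportMessages: (messages) =>
      messages.map((m) => `"${m.id}" = "${m.text}";`).join("\n"),
  },
];

// Called by an external trigger (a CLI command, a CI step, or an editor
// button), not by a change event inside the SDK.
function runExport(exporterIds: string[], messages: Msg[]): Record<string, string> {
  const output: Record<string, string> = {};
  for (const exporter of configuredExporters) {
    if (exporterIds.includes(exporter.id)) {
      output[exporter.id] = exporter.exportMessages(messages);
    }
  }
  return output;
}

const exported = runExport(["ios"], [{ id: "login_button", text: "Log in" }]);
console.log(exported);
```

An import trigger could mirror this with a `runImport` that takes an importer id and the external content, which would cover the CLI, webhook, and upload cases listed above.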
@martin-lysk do you think a sync call where we draft spec in google docs is quicker than github back and forth? If so, let's schedule a call
Issue #1844 will be completed before this one.
This proposal is a reaction to https://github.com/inlang/monorepo/issues/1844#issuecomment-1860894998
Proposal - introduction of aliases via a map
Introduce an alias map plugins can use to establish a relationship between message id and exported/imported key name.
Pros
- plugins can make use of import/export keys
- apps can display and edit "alias for i18next is login.button" because the plugin id is known
Cons
- ?
const message = {
  id: "human_airplane_globe",
+ alias: {
+   "plugin.inlang.i18next": "login.button",
+   "plugin.inlang.xml": "login-button"
+ }
}
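A plugin could use such an alias map roughly like this. This is a sketch: the `AliasedMessage` type and the `keyFor` and `findByAlias` helpers are hypothetical, and falling back to the message id when no alias exists is an assumption, not something decided in this thread.

```typescript
type AliasedMessage = {
  id: string;
  // plugin id -> key used by that plugin's file format
  alias?: Record<string, string>;
};

const aliased: AliasedMessage = {
  id: "human_airplane_globe",
  alias: {
    "plugin.inlang.i18next": "login.button",
    "plugin.inlang.xml": "login-button",
  },
};

// Export: use the plugin's alias if one exists, otherwise fall back to the id
// (the fallback is an assumption of this sketch).
function keyFor(message: AliasedMessage, pluginId: string): string {
  return message.alias?.[pluginId] ?? message.id;
}

// Import: find the message that a plugin-specific key belongs to.
function findByAlias(
  messages: AliasedMessage[],
  pluginId: string,
  key: string
): AliasedMessage | undefined {
  return messages.find((m) => m.alias?.[pluginId] === key);
}

console.log(keyFor(aliased, "plugin.inlang.i18next"));
console.log(findByAlias([aliased], "plugin.inlang.xml", "login-button")?.id);
```

Because the alias is keyed by plugin id, an app can show "alias for i18next is login.button" without knowing anything about the plugin's file format.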