Integrate support for translated content

Open jakubcech opened this issue 5 years ago • 10 comments

Description

We'd like to have support for content in multiple languages. The content will be added by Crowdin PRs.

The site needs to support:

  1. (UX) Language switching (by a switcher in the UI)
  2. Different folder structure as the files will be added to the documentation repo by PRs:
    • Add en as an extra folder for the current content
    • Add the content for the other languages in the respective folders es, de, fr, ... what other languages? cn?
  3. The build pipeline and scripts need to change, as they will need to look for the .md files in a slightly different path.
  4. We need to make sure the supporting files, like doc-index, home.md, and others, are updated as well. Probably by us?
    • Do we translate all the content into all the languages at first? That means each language would have its own landing page, its own navigation, and so on.
    • Where exactly in the path should we put the language part? If it's <project>/0.1/<language>/<content> then things could be set up differently than if it's <language>/<project>/0.1/<content> and so on.
    • Do we keep the file names for the content exactly the same and only use the folder structure to distinguish things? That would probably make maintaining the support files easier, but resolving them might need extra support that takes the path and the currently selected version into account.

jakubcech avatar May 03 '19 07:05 jakubcech

What do we do about translating images?

obany avatar May 03 '19 07:05 obany

Hmm, let's start with not translating them? Then we can move on with translating the important ones for the most utilized languages. That's at least what MS does in their docs.

jakubcech avatar May 03 '19 08:05 jakubcech

I agree with Jakub. We can start with content. Then, we'll need to think of a strategy for images. Some images will be easy (screenshots in Trinity). Others will be more difficult as the text often expands when translated.

French, Russian, Chinese, German, and Spanish are our most popular languages. Can the site support Cyrillic script or Chinese characters?

Could we add a discreet badge to each page to let readers know that we are crowdsourcing translations? https://crowdin.com/project/iota-documentation/settings#badges

JakeSCahill avatar May 03 '19 08:05 JakeSCahill

It'd be great to translate the landing page and the navigation too. Because these are pulled in from markdown files, we could have home_es.md....

For the structure, <project>/0.1/<language>/<content> might be easier. Then, we can keep the images folder outside of language for now.

JakeSCahill avatar May 03 '19 08:05 JakeSCahill

Depending on the final folder structure, we can ask translators to update any link paths. For example: blueprints/0.1/en/introduction/overview.md becomes blueprints/0.1/es/introduction/overview.md

Then, we could have:

    • blueprints/0.1/images/
    • blueprints/0.1/en/
    • blueprints/0.1/es/
    • blueprints/0.1/home.md
    • blueprints/0.1/home_es.md
    • ...
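A minimal sketch of the link-path update described above, assuming the <project>/<version>/<language>/<content> layout. The function name `localize_link` and its parameters are hypothetical and only illustrate the idea; nothing like this exists in the repo yet.

```python
from pathlib import PurePosixPath


def localize_link(path: str, language: str, source_language: str = "en") -> str:
    """Swap the language folder in <project>/<version>/<language>/<content>."""
    parts = list(PurePosixPath(path).parts)
    # Under the assumed layout, the language folder is the third path segment.
    if len(parts) < 4 or parts[2] != source_language:
        raise ValueError(f"not a {source_language} content path: {path}")
    parts[2] = language
    return str(PurePosixPath(*parts))


print(localize_link("blueprints/0.1/en/introduction/overview.md", "es"))
```

If translators run something like this over the links in a file, the rest of the path stays untouched, so the images folder outside the language folders keeps working.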

JakeSCahill avatar May 03 '19 08:05 JakeSCahill

The spell checker (https://github.com/hcodes/yaspeller) supports only Russian and English. We could probably skip it for translations, though; that check will be done as part of QA on Crowdin.

JakeSCahill avatar May 03 '19 08:05 JakeSCahill

It should have no problems with other character sets; they are all Unicode.

The most common browser languages from 1st Jan this year were:

Language %
en-us 41.12%
de-de 12.73%
en-gb 7.37%
de 4.04%
zh-cn 3.59%
es-es 3.56%
zh-tw 2.20%
ru-ru 2.09%
pt-br 2.05%
it-it 1.89%
fr-fr 1.80%
nl-nl 1.70%
en 1.17%
ko-kr 1.16%
tr-tr 1.04%
de-ch 0.68%
pl-pl 0.65%
de-at 0.56%
en-ca 0.53%
en-in 0.53%
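To compare languages rather than regional variants, the figures above can be grouped by primary language subtag. This is only a quick sketch over the numbers in this comment; the names `SHARES` and `by_language` are made up for illustration.

```python
from collections import defaultdict

# Browser-language shares (percent), copied from the list above.
SHARES = {
    "en-us": 41.12, "de-de": 12.73, "en-gb": 7.37, "de": 4.04, "zh-cn": 3.59,
    "es-es": 3.56, "zh-tw": 2.20, "ru-ru": 2.09, "pt-br": 2.05, "it-it": 1.89,
    "fr-fr": 1.80, "nl-nl": 1.70, "en": 1.17, "ko-kr": 1.16, "tr-tr": 1.04,
    "de-ch": 0.68, "pl-pl": 0.65, "de-at": 0.56, "en-ca": 0.53, "en-in": 0.53,
}


def by_language(shares):
    """Sum shares per primary language (the part before the first hyphen)."""
    totals = defaultdict(float)
    for locale, share in shares.items():
        totals[locale.split("-")[0]] += share
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return [(language, round(total, 2)) for language, total in ranked]


for language, total in by_language(SHARES):
    print(f"{language}: {total:.2f}%")
```

Grouped this way, en comes to about 50.7%, de to about 18.0%, and zh to about 5.8% of the listed traffic.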

The language should probably go at the very top level to keep each language from polluting the content of the others. We would only include content that has been translated, and fall back to en for anything that is missing (the same applies to images, indexes, home, etc.). That way you could translate just a single file and everything else would still work, which makes it a more granular process for the translators.

To be able to switch between languages the folder/files/structure must be kept the same, otherwise the app won't know where to navigate to in the other language. This also means translators would not need to worry about the structure.
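Roughly, the fallback could look like the sketch below. It assumes a top-level language folder and an identical relative path in every language; `resolve_page` and the `available` set are hypothetical stand-ins for however the site actually looks up files.

```python
from typing import Optional, Set


def resolve_page(
    relative_path: str,
    language: str,
    available: Set[str],
    default_language: str = "en",
) -> Optional[str]:
    """Return the translated file if it exists, otherwise the en version."""
    for candidate_language in (language, default_language):
        candidate = f"{candidate_language}/{relative_path}"
        if candidate in available:
            return candidate
    # Neither the translation nor the default-language file exists.
    return None


files = {
    "en/blueprints/0.1/home.md",
    "en/blueprints/0.1/introduction/overview.md",
    "es/blueprints/0.1/introduction/overview.md",
}
print(resolve_page("blueprints/0.1/introduction/overview.md", "es", files))
print(resolve_page("blueprints/0.1/home.md", "es", files))
```

Because the relative path is the same in every language, switching language is just a second lookup with a different `language` argument.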

obany avatar May 03 '19 08:05 obany

We don't use yaspeller anymore; we use node-spellchecker (https://github.com/atom/node-spellchecker), which supports your native spell checker.

obany avatar May 03 '19 08:05 obany

For Crowdin configuration, we can do: /%locale%/**/%original_file_name%

This should create a top-level language folder and then replicate the same file/folder structure as the source en folder.

To import the content into Crowdin, we can do /**/*.md, OR if we put everything in an en folder, we can do /en/**/*.md.
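To illustrate the mapping we expect from that pattern, here is a small sketch. It is not Crowdin's implementation, just the intended result when the source lives in an en folder; `translation_path`, `language_folder`, and `source_root` are hypothetical names.

```python
from pathlib import PurePosixPath


def translation_path(source_file: str, language_folder: str, source_root: str = "/en") -> str:
    """Mirror a source file under a top-level language folder."""
    source = PurePosixPath(source_file)
    # The sub-path below the source root plays the role of ** in the pattern.
    sub_path = source.parent.relative_to(source_root)
    return str(PurePosixPath("/") / language_folder / sub_path / source.name)


print(translation_path("/en/blueprints/0.1/introduction/overview.md", "es"))
```

One thing to check in Crowdin's placeholder documentation is what %locale% expands to. If it produces a region-qualified code rather than the short es, de, fr folder names, a different placeholder or a language mapping may be needed.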

JakeSCahill avatar May 03 '19 08:05 JakeSCahill

For images, we can translate the "title" attribute, like we do for accessibility for developers with disabilities. I don't think we have added this yet.

NelsonPython avatar May 03 '19 21:05 NelsonPython