ros2_documentation
Multi-language support
Hi all,
I would like to push for multi-language support in the documentation, so that the official ROS 2 documentation is available in Spanish, Portuguese, Japanese, French, and so on. I am convinced that language is an entry barrier for many potential ROS users. I am aware, at least in the local Spanish group, that a lot of effort is going into translating documents. I wonder if we could support it from this repo.
I have seen that the documentation is in https://docs.ros.org/en/, so it is reasonable to have a https://docs.ros.org/es/, https://docs.ros.org/pt/, https://docs.ros.org/fr/..., isn't it? We could initially fill the non-English versions with the English version, and let people, even non-technical people, contribute with their translations.
Do you think it is reasonable? Is it technically viable?
Thanks Francisco
Unless a very small subset of the documentation is translated (e.g., installation instructions and some basic tutorials), I think that the translations would quickly become unmanageable and outdated or would just generally lag behind.
To complement the subset of manually-translated pages, we could perhaps rely on automatic translations. I just looked at Google's translation of the main Ubuntu installation instructions in French. It's not great (and of course doesn't translate text in images and some text in code blocks), but it's not terrible either.
Yes, we would need some technical resources to track outdated pages. It might be enough to check the timestamp and make an automatic issue that a revision on that page is required.
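A minimal sketch of the timestamp check described above, assuming the upstream and translated pages live in two directory trees with identical relative paths (the function name and directory layout are hypothetical, not part of any existing tool):

```python
from pathlib import Path


def find_outdated(upstream_dir, translated_dir):
    """Return relative paths of translated pages older than their upstream source."""
    upstream_dir = Path(upstream_dir)
    translated_dir = Path(translated_dir)
    outdated = []
    for source in sorted(upstream_dir.rglob("*.rst")):
        relative = source.relative_to(upstream_dir)
        translated = translated_dir / relative
        # A page with no translation yet is missing, not outdated.
        if not translated.exists():
            continue
        # Compare file modification times (seconds since the epoch).
        if translated.stat().st_mtime < source.stat().st_mtime:
            outdated.append(relative.as_posix())
    return outdated
```

File modification times are reset by a fresh git checkout, so a real check would more likely compare the last commit date of each file. Each returned path could then become one automatic issue.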
I don't know if this repo should contain the docs in other languages. ROS has always followed a federated development model. Maybe interested local user groups in specific languages could maintain a fork of this repo, in which they would start translating the documents. Each local group would be responsible for updating the documents.
Maybe we could move this discussion to ROS Discourse. I will make a post there...
I have made some progress here. To address the concern raised by @christophebedard, I have developed https://github.com/fmrico/sync-docs, a GitHub action that:
- Runs in a forked repo of Sphinx-based documentation, for example https://github.com/ROS-Spanish-Users-Group/ros2_documentation and https://github.com/ROS-Spanish-Users-Group/PlanSys2.github.io
- Automatically merges any upstream change into the forked repo every day.
- Fails CI and creates an issue with the details if an upstream change makes a translated page outdated.
The action is in the testing stage, but it could be helpful for forked repos for language translations.
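The core check behind such an action can be sketched as a set intersection. This is an illustration only, not the actual sync-docs implementation; the function names and the way the two file lists are obtained are assumptions:

```python
def outdated_translations(changed_upstream, translated_pages):
    """Return the translated pages touched by an upstream change, sorted."""
    return sorted(set(changed_upstream) & set(translated_pages))


def issue_body(outdated):
    """Format a plain-text issue body listing pages that need a new review."""
    lines = ["The following translated pages changed upstream:"]
    lines += [f"- {page}" for page in outdated]
    return "\n".join(lines)
```

Here `changed_upstream` would be the list of files modified by the daily merge, and `translated_pages` the list of pages the local group has already translated. A non-empty result is what would make CI fail.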
This issue has been mentioned on ROS Discourse. There might be relevant details there:
https://discourse.ros.org/t/ros-2-documentation-in-other-languages/28811/4
IMO, I believe this is gonna be good for ROS local community, I will second this approach.
> Unless a very small subset of the documentation is translated (e.g., installation instructions and some basic tutorials), I think that the translations would quickly become unmanageable and outdated or would just generally lag behind.
This is true. I am skeptical about having multi-language support in mainline; support based in the local communities would probably be better.
> I don't know if this repo should contain the docs in other languages.
Probably not, for the same reason as above.
But if we do add multi-language support to this repo, I would request the following dependency rules:
- mainline doc WILL NOT depend on any multiple language contents.
- Only multiple language contents can refer to mainline doc.
Agree completely, @fujitatomoya.
This repo (English version) is the reference. Any local group should maintain a fork under its own organization or user account. I have provided a GitHub action to help keep the forked repos in sync.
We would only need help generating and linking the docs under https://docs.ros.org/*/
This issue has been mentioned on ROS Discourse. There might be relevant details there:
https://discourse.ros.org/t/llamada-para-traductores-de-la-documentacion-oficial-de-ros-2/28882/1
Hi @fmrico, I have been quickly trying out the Sphinx multilingual support, and it produces a small set of files for each language. This is within the rolling branch of ros2_documentation and rolling branch
cd sources
ln -s ../conf.py .
ln -s ../favicon.ico .
sphinx-build -b gettext . ../build/gettext
The output in build/gettext is
(ros2_doc) ➜ gettext git:(rolling) ✗ ls
Citations.pot Contact.pot How-To-Guides.pot Related-Projects.pot The-ROS2-Project.pot index.pot
Concepts.pot Glossary.pot Installation.pot Releases.pot Tutorials.pot
Each string has an identifier, so it seems very easy to maintain.
The only caveat is that any modification to the documentation upstream will change the msgid (which might be seen as a feature).
What are your thoughts on this approach?
We could even maybe use https://github.com/SekouD/potranslator to generate a first translation...
Hi @olivier-stasse
Over Christmas, some local user groups implemented a more straightforward solution: maintaining a fork, with the help of GitHub actions to synchronize changes from upstream. These groups are in charge of maintaining the translations, which avoids adding complexity upstream. We still have to finish discussing how to generate and link these repos from the official ROS documentation web page.
IMHO the solution that you propose is a big change in the structure and work done in this repo. Let's see how the current approach works before changing the course.
Let's continue discussing this here, or in TSC.
Best
Hi @fmrico, Sorry for my lack of precision, we also have created a local French group with a fork here: https://github.com/ROS-French-Users-Group/ros2_documentation following more or less the Spanish group organization.
We also are trying to use the github action to synchronize changes with respect upstream.
We have been evaluating the work to do. And after reading the Sphinx documentation I am wondering if the forked group could be organized using the sphinx po files.
IMHO it was more interesting to share the discussion here rather than in the fork.
Best.
Hi, a quick update on my technical/user experiment with one of the solutions I mentioned previously, in response to @christophebedard's comments on automatic translation.
Brief conclusion
The burden of this technical solution is justified only if you have a significant number of non-technical users willing to help with translation. Otherwise the solution suggested by @fmrico is probably more efficient for people used to GitHub.
Detailed explanations
All the tests were done on the French fork.
The Sphinx multilingual support works as follows. First you need to generate intermediate pot files with:
sphinx-build -b gettext . _build/gettext
From these pot files it is possible to generate po files for a specific language. In my case I tried French:
sphinx-intl update -p _build/gettext -l fr
On my local branch this generated:
locale
└── fr
└── LC_MESSAGES
└── index.po
The po files are using the reference language (here en) as an identifier. For instance for Related-Projects.rst you have:
#: ../../source/Related-Projects.rst:3
msgid "Related Projects"
msgstr ""
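To illustrate the msgid/msgstr structure, here is a minimal sketch of a reader that lists which strings still lack a translation. It is a simplification written for this thread: it handles single-line and continued strings only, and ignores plural forms and msgctxt, which a full gettext parser would support. The function names are hypothetical.

```python
import ast


def parse_po(text):
    """Return a list of (msgid, msgstr) pairs from gettext PO text."""
    entries = []
    fields = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("msgid "):
            # A new msgid closes the previous entry, if any.
            if "msgid" in fields:
                entries.append((fields["msgid"], fields.get("msgstr", "")))
            fields = {"msgid": ast.literal_eval(line[len("msgid "):])}
            current = "msgid"
        elif line.startswith("msgstr "):
            fields["msgstr"] = ast.literal_eval(line[len("msgstr "):])
            current = "msgstr"
        elif line.startswith('"') and current is not None:
            # Continuation line: append to the field being read.
            fields[current] += ast.literal_eval(line)
        else:
            current = None  # comment or blank line
    if "msgid" in fields:
        entries.append((fields["msgid"], fields.get("msgstr", "")))
    return entries


def untranslated(entries):
    """Return msgids whose msgstr is still empty (the header entry is skipped)."""
    return [msgid for msgid, msgstr in entries if msgid and not msgstr]
```

Applied to the entry above, `untranslated(parse_po(...))` would report "Related Projects" as pending, since its msgstr is empty.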
From this point the Sphinx documentation gives two choices, but there is a third one that answers @christophebedard's question on automatic translation. The Sphinx choices are either to edit the po files manually or to use Transifex. Manually editing the po files is straightforward but does not add anything to @fmrico's solution. The second choice only makes sense if there is a strong base of non-technical users. Unfortunately the call to the French local user group did not bring in that many volunteers.
The third solution is to use an automatic tool such as the one provided by the following repo: https://github.com/SekouD/potranslator (N.B. the author has a new GitHub account, but the pip install of the package points to this repo). It has gone two years without any update, and relies on google_translate. I had to update the google_translate Python package in the Python virtual env to a newer version, and it worked, but in a rather unsatisfactory way: the connection to the Google service was very slow and broke often. I managed to translate the files, but had to run potranslator separately for each directory. An automatic translation would therefore need some effort to make this tool more robust. From Stack Overflow threads it looks like Google breaks this API from time to time, so I am not sure this is worth it. In addition, the translation needs to be checked and partly reformulated. It can be a good starting point though, and this is the main reason why I went all the way through to this lengthy comment.
But I am not sure this is a valid alternative to @fmrico's proposal for a GitHub action.
This issue has been mentioned on ROS Discourse. There might be relevant details there:
https://discourse.ros.org/t/documentation-en-francais/28896/14
Hi @olivier-stasse
Thanks for pushing this forward and evaluating more alternatives. My concern about using po files is losing the direct connection between the original documentation in English and our translated pages. The fork approach is easier to maintain and does not force anybody to make any modifications to the official documentation.
Translating is a best-effort task to help people who prefer to read in their own language, or who are not able (or find it hard) to read in English. A translated page is good; in any other case, the page will be in English.
As you said, it is challenging to enroll volunteers for this. Maybe today we have volunteers, and tomorrow we don't. In any case, I assume that they are technical people, and git shouldn't be a barrier.
Hi @fmrico, just to clarify one point: the po files can be created from the official documentation without modifying it. The official documentation is itself the string-to-be-translated identifier. The newly created files DO NOT HAVE to be included in the main repo. So you do not lose the reference to the official documentation and do not add anything to the main repo.
It just adds one layer of complexity, and thus is justified only if one wants to use efficient third-party tools.
Again thanks for launching this initiative, this is an interesting exercise.
Thanks @olivier-stasse
Let me dive into your work about po files. Maybe we shouldn't discard it.
Hi @clalancette
I want to return to this thread after the TSC discussion.
We have translated 57/258 (22%) of the documentation in the Spanish version, and it is online at the GitHub page URL. @olivier-stasse has probably also made progress on this task. I think we can start thinking about how to link it to the official documentation. I see two options:
- (preferred) When the official documentation is built from the official repo and linked under https://docs.ros.org/en/, do exactly the same for a list of translated repos, and link them under https://docs.ros.org/es/, https://docs.ros.org/fr/, etc...
- Delegate the generation and URL to the Local Users Groups in their repos, as is currently, but add a section in the official documentation with links to them. This could be done pretty fast.
Maybe we could start with the second option while the first is implemented. What do you think? Francisco
Hi, Some follow up on the automation of the translation.
Brief feedback
I succeeded in applying google_translate to the whole ros2_documentation using potranslator3, which is a fork of potranslator. The result is here: https://ros-french-users-group.github.io/ros2_documentation/
Overall, the Google translation is a very good starting point. I finally came back to this solution because modifying the rst files is rather tedious, and for the few interested volunteers it is overwhelming.
Technical description
The overall process
As specified in the Sphinx internationalization documentation, you need to:
- Modify the conf.py file to specify the language and the locale_dirs.
- Generate .pot files from .rst files using make gettext. The resulting files are in build/gettext by default.
- Generate .po files from .pot files for a specific language. In my case it was fr. The .po files have msgid and msgstr fields.
- Translate the .po files.
- Build the .mo files.
- Build the html files.
The fr example.
1. The conf.py file was modified by switching the variable language to the locale language fr, and the following lines were added:
locale_dirs = ['locales/'] #path is an example but this is the recommended path.
gettext_compact = False #optional.
2. Generate the .pot files with:
make gettext
which generates the whole directory tree in build/gettext.
3. Generate translated .po files using google_translate.
For this, potranslator3 was used:
potranslator update -d source/locales -l fr -p build/gettext
This generates the .po files in ./source/locales/fr/LC_MESSAGES/, following the same directory structure as the documentation.
4. The next step is to generate the .mo files. It was done using a script:
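#!/bin/zsh
# Compile every .po file below the current directory into a .mo file next to it.
for afile in ./**/*.po(.)   # (.) restricts the glob to regular files
do
  targetfile="${afile:r}.mo"   # :r strips the .po extension
  echo "msgfmt $afile -o $targetfile"
  msgfmt "$afile" -o "$targetfile"
done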
- To build the html file:
make html
Based on the language variable in conf.py and the .mo files located next to the .po files in source/locales/fr/LC_MESSAGES/, the system generates the translated documentation.
7. Copy the generated html files in the github branch of the local user group.
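For anyone not using zsh, the .mo compilation step above could be sketched in Python as follows. This is an equivalent written for illustration, not the script that was actually used; the function names are hypothetical and it assumes msgfmt (from GNU gettext) is on the PATH.

```python
import subprocess
from pathlib import Path


def msgfmt_commands(root):
    """Build one msgfmt command per .po file found under root."""
    commands = []
    for po_file in sorted(Path(root).rglob("*.po")):
        if po_file.is_file():
            # The .mo file is written next to its .po source.
            mo_file = po_file.with_suffix(".mo")
            commands.append(["msgfmt", str(po_file), "-o", str(mo_file)])
    return commands


def compile_catalogs(root):
    """Print and run each command, stopping at the first failure."""
    for command in msgfmt_commands(root):
        print(" ".join(command))
        subprocess.run(command, check=True)
```

Calling `compile_catalogs("source/locales")` would then play the role of the zsh loop. Keeping the command construction separate from the execution makes it easy to inspect what would be run.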
What remains to be done ?
- Fix some wrong translations. This can be done by proposing PRs against the translated .po files in the branch where they are located.
- Check whether the msgid modifications in the reference repository, i.e. ros2/ros2_documentation, can be detected, and possibly offer a first translation.
- Automate the process with a GitHub action.
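As a starting point for detecting msgid modifications, one possible sketch is to compare the msgids of two versions of the same catalog. This is an assumption about how it could be done, not an existing tool; it only looks at single-line msgids, and the function names are hypothetical.

```python
import re

# Matches non-empty, single-line msgid entries.
MSGID_LINE = re.compile(r'^msgid "(.+)"$', re.MULTILINE)


def msgids(pot_text):
    """Collect non-empty single-line msgids from POT/PO text."""
    return set(MSGID_LINE.findall(pot_text))


def msgid_changes(old_pot, new_pot):
    """Return (removed, added) msgids between two versions of a catalog."""
    old, new = msgids(old_pot), msgids(new_pot)
    return sorted(old - new), sorted(new - old)
```

A modified upstream sentence shows up as one removed and one added msgid. Each added msgid is a string that needs a first (possibly automatic) translation, and each removed one marks a translation that is no longer used.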