wagtail-localize Fragmentation of rich-text fields in editor for translated pages

Possibly related: #253

The current behaviour of rich-text blocks in the translation editing interface is to display every block-level element (p, img, li etc) in its own translation field, so that individual fragments can be updated in the source language without retranslating the whole page, and so that images don't need translating.
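To make the behaviour under discussion concrete, here is a minimal sketch of per-block segmentation using only the Python standard library. It is not wagtail-localize's actual implementation: `BLOCK_TAGS`, `BlockSegmenter` and `split_into_segments` are hypothetical names for illustration, inline markup is dropped, and nested block elements are not handled.

```python
from html.parser import HTMLParser

# Assumption: the block-level tags that each get their own translation field.
BLOCK_TAGS = {"p", "li", "h1", "h2", "h3", "h4", "h5", "h6"}


class BlockSegmenter(HTMLParser):
    """Collect the text of each translatable block-level element."""

    def __init__(self):
        super().__init__()
        self.segments = []
        self._buffer = None  # None while outside a translatable block

    def handle_starttag(self, tag, attrs):
        # Untranslatable tags such as <img> or <embed> are simply ignored.
        if tag in BLOCK_TAGS:
            self._buffer = []

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag in BLOCK_TAGS and self._buffer is not None:
            text = "".join(self._buffer).strip()
            if text:
                self.segments.append(text)
            self._buffer = None


def split_into_segments(html):
    """Return one string per block-level element, in document order."""
    parser = BlockSegmenter()
    parser.feed(html)
    parser.close()
    return parser.segments


source = (
    "<p>We offer these services:</p>"
    "<ul><li>Design</li><li>Hosting</li></ul>"
    '<embed embedtype="image" id="1"/>'
)
segments = split_into_segments(source)
```

Each entry in `segments` corresponds to one translation field, which is why a translator cannot easily turn a three-item list into five items or into a single paragraph.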

This is feedback from a client who is using wagtail-localize.

In some cases, translators will add or remove information that is managed through the use of block-level elements in the code (<li> is the main example, but <p> is one too), just because on commercial pages, arguments are not the same depending on the target, or we want to explain things in a whole different way to users from a given country or speaking a given language.

I think the implication is that a 1:1 relationship between block-level elements is an unnecessary optimisation, because of the subtleties of translation. With the current arrangement, it's hard to see how translators could rewrite a 3-element list in English ("we offer these services in the UK…") as a 5-element list in French ("we offer even more services in France…") and a single paragraph in Japanese ("sorry, these services are only available in Europe").

If users want to retain some of the benefits listed above (editing and retranslating fragments only) they would be welcome to break the page up into smaller rich-text fragments manually.

Nov 23 '20 10:11 nimasmi

This would be really difficult to implement. Because of the way stream/rich text fields work, it's not simple to have both the ability to translate each paragraph/list item individually and the ability to add additional blocks. I think this requires us to rewrite the streamfield/rich text interfaces specifically for translation.

The solution we provide for this at the moment is the "End translation" button in the action menu. You can use Wagtail Localize to make a translated copy of the content of the original page, then click the "End translation" button in the action menu to switch it to the regular Wagtail editor for further changes.

I'll leave this open in case we can ever think of a solution to this, but this probably isn't going to happen for a while.

Nov 24 '20 09:11 kaedroho

I think I may have not been clear.

> it's not simple to have both the ability to translate each paragraph/list item individually

I submit that this is not a necessary feature, and might be premature optimisation. Translators of static documents (manuals, magazines, news articles) are used to working with longer blocks of text, and particularly with text in context. Creating an individual translation input for a page title, a menu entry, or a heading in the HTML furniture is one thing, but a list item in a large block of text has a role in the context of that larger block of text, and it seems reasonable to translate it as such.

I'm suggesting that we ditch the splitting of a RichText field into its component block-level elements, and present it in the translation interface as a complete rich text field, images and all, i.e.

- drop the 1:1 relationship between block-level elements on the source page and the translated page (where each paragraph/li/table becomes a separate input)
- implement a 1:1 relationship between RichText fields instead (where each field, as a collection of paragraphs, images, ul elements etc. gets a single larger input)

Or we make the behaviour optional.

This seems to me a simpler option to code than what is done at the moment. Would the structure of the code prevent this? Is there an issue with presenting a RichText field in the translation UI?

Nov 24 '20 10:11 nimasmi

There are a few reasons why I went with per-paragraph instead of per-field:

- Rich text fields can be quite big, and this can make the experience of translating them using a PO file or Pontoon harder
- If a single paragraph is edited, you don't need to retranslate the whole field again, just the paragraph that changed
- Translators don't need to care about structural elements (like <p>, <h1>) or untranslatable tags (<img>, <embed>, which have Wagtail-specific attributes).
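The second reason above can be illustrated with a small sketch. It assumes, purely for illustration, that translations are stored in a dictionary keyed by the source string of each segment; `pending_segments` and `assemble_translation` are hypothetical helpers, not part of wagtail-localize.

```python
def pending_segments(source_segments, translations):
    """Return the source segments that have no stored translation yet."""
    return [s for s in source_segments if s not in translations]


def assemble_translation(source_segments, translations):
    """Build the translated segment list; untranslated segments become None."""
    return [translations.get(s) for s in source_segments]


# Hypothetical stored French translations, keyed by source string.
translations = {
    "Welcome to our site.": "Bienvenue sur notre site.",
    "We ship to the UK.": "Nous livrons au Royaume-Uni.",
}

# The source page is edited: only the second paragraph changes.
edited_source = ["Welcome to our site.", "We ship to the UK and Ireland."]

todo = pending_segments(edited_source, translations)
draft = assemble_translation(edited_source, translations)
```

Here `todo` contains only the edited paragraph, so the unchanged one keeps its existing translation. With one segment per field, any edit would change the key for the whole field and the entire text would need retranslating.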

I thought that having the blocks in order as they are on the page and having the preview feature would provide enough context for translators, but maybe I was wrong to assume that? I agree with your other ticket that we could improve the styling so it's easier to see what's a list item and what's a paragraph, etc.

I have a long-term goal to merge rich text block with Streamfield, so you can get a field that has a rich text editor UI but underneath the data is represented in JSON similar to Streamfield. Doing this would make what you're asking for much more difficult to implement, but it would make the idea of allowing extra Streamfield blocks to be added into the translation work for rich text as well.

Nov 24 '20 10:11 kaedroho

> The solution we provide for this at the moment is the "End translation" button in the action menu. You can use Wagtail Localize to make a translated copy of the content of the original page, then click the "End translation" button in the action menu to switch it to the regular Wagtail editor for further changes.

There's a catch with this approach: currently the editor needs to publish the page before the translations made in "translation" mode appear in "non-translation" mode. A button to copy all changes made in the "translation" view to the normal page content would help here. It would give a workable approach in which the editor starts in "translation" mode, with all the assistance that mode provides, then copies the draft to the non-synced mode and finalizes the page in any way imaginable.

Oct 12 '23 14:10 vladox
