Solve some problems with RTL Languages

Here I will describe some problems with RTL languages and their solutions. I will explain all the points here so we can discuss them, and maybe later add a section about these problems and solutions to the Guidelines in CONTRIBUTING.

The discussion behind this issue started in these PRs: https://github.com/EbookFoundation/free-programming-books/pull/6706 and https://github.com/EbookFoundation/free-programming-books/pull/6715

What is the issue?

Suppose we have an RTL entry like this:

* [تعلم البرمجة](URL) - Author Name

Note: تعلم البرمجة means "Learn Programming".

It will appear on the website like this:

In this case, we can just wrap the list in a div with dir="rtl":

<div dir="rtl">

* [تعلم البرمجة](URL) - Author Name
</div>

Result:

Is that it? No! The monster shows up below 😢

Mixing RTL with LTR languages issue!

The real problem appears when RTL and LTR text are mixed in the same line.

Case 1

<div dir="rtl">

* [تعلم HTML](URL) - Author Name
</div>

Note: تعلم means "Learn".

Result:

Look, the words went through a mixer!

Case 2

If we need an LTR entry to be aligned to the right (both the author name and the title are LTR)

<div dir="rtl">

* [Learn HTML](URL) - Author Name
</div>

Result:

Both words have been swapped!!

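Solution?

We can solve these two problems with a Unicode mark called RLM (right-to-left mark): https://en.wikipedia.org/wiki/Right-to-left_mark

Add &rlm; after the LTR word that should behave as part of the RTL text. The mark is an invisible character that acts like an RTL letter at that position, so the word before it is ordered as if it were an RTL word.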
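To make the idea concrete, here is a minimal Python sketch (not part of the repo tooling; the helper name `bidi_classes` is made up for illustration) that looks up the Unicode bidirectional class of each character. It only shows that the mark itself carries a strong RTL class while Latin letters carry a strong LTR class; it does not run the full bidirectional algorithm that browsers use.

```python
import html
import unicodedata

# The HTML entities decode to single invisible characters.
RLM = html.unescape("&rlm;")  # U+200F RIGHT-TO-LEFT MARK
LRM = html.unescape("&lrm;")  # U+200E LEFT-TO-RIGHT MARK


def bidi_classes(text):
    """Return the Unicode bidirectional class of every character in text."""
    return [unicodedata.bidirectional(ch) for ch in text]


print(bidi_classes("HTML"))        # Latin letters: strong LTR class "L"
print(bidi_classes("تعلم"))        # Arabic letters: strong RTL class "AL"
print(bidi_classes("HTML" + RLM))  # the trailing mark has strong RTL class "R"
```

So a title that ends in a Latin word followed by the mark ends, from the renderer's point of view, in an RTL character.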

Solve case 1

<div dir="rtl">

* [تعلم HTML&rlm;](URL) - Author Name
</div>

Result:

We added &rlm; after HTML

Solve case 2

<div dir="rtl">

* [Learn HTML&rlm;](URL) - Author Name
</div>

Result:

You get the point!

Extra Cases!

Case 1

Try to make C# go to the right!

<div dir="rtl">

* C#
* [تعلم لغة C# الرائعة](URL) - إسم المؤلف
</div>

Note: * [تعلم لغة C# الرائعة](URL) - إسم المؤلف means * [Learn the Cool C# Language](URL) - Author Name

Result:

Symbols have the same problem when we put them in an RTL context, and the solution is similar 😉: the LRM Unicode mark: https://en.wikipedia.org/wiki/Left-to-right_mark

<div dir="rtl">

* C#&lrm;
* [تعلم لغة C#&lrm; الرائعة](URL) - إسم المؤلف
</div>

We use &lrm; here, not &rlm;. Why? When C# is placed in an RTL context so that it moves to the right, the symbol # is rendered as if it were part of the RTL text, so it is reordered to the other side of the C.

By adding &lrm; after C#, we mark it as an LTR word, so it renders as an LTR word.
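The reason the symbol moves can be seen from its Unicode bidirectional class: # has no strong direction of its own, so it follows its surroundings. The sketch below is only an illustration (the function name `trailing_non_strong` is hypothetical); it reports the trailing characters of a string that have no strong direction, which are the ones that may need a mark after them.

```python
import unicodedata

# Bidirectional classes that carry a strong direction.
STRONG = {"L", "R", "AL"}


def trailing_non_strong(text):
    """Return the trailing characters of text that have no strong direction."""
    tail = []
    for ch in reversed(text):
        if unicodedata.bidirectional(ch) in STRONG:
            break
        tail.append(ch)
    return "".join(reversed(tail))


print(repr(trailing_non_strong("HTML")))       # nothing trails the letters
print(repr(trailing_non_strong("C#")))         # the "#" has no strong direction
print(repr(trailing_non_strong("C#\u200e")))   # LRM after "#" is strong LTR
```

Once the LRM follows the #, the symbol sits between two strong LTR characters, which matches the fix described above.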

Case 1.1

Both the author name and the title are LTR, and the title ends with a symbol, as in C#

<div dir="rtl">

* [Learn C#](URL) - Author Name
</div>

Result:

The first step is simple: just put &rlm; at the end of the title

<div dir="rtl">


* [Learn C#&rlm;](URL) - Author Name
</div>

Result:

But note that the symbol # is rendered as part of the RTL text, so it is reordered to the other side. So we must also put &lrm; right after this symbol.

<div dir="rtl">

* [Learn C#&lrm;&rlm;](URL) - Author Name
</div>

Result:

Case 2

If the title is in English and the author name is in Arabic

* [Learn HTML](URL) - إسم المؤلف

Result:

It is enough to set the direction to RTL, without adding any Unicode mark

<div dir="rtl">

* [Learn HTML](URL) - إسم المؤلف
</div>

Result:

Case 3

Sometimes we add some information, like (:construction: *in process*), after the author name

<div dir="rtl">

* [عنوان بالعربي](URL) - Author Name (meta data)
* [Title In LTR&rlm;](URL) - Author Name (meta data)
</div>

Result:

It looks correct, but we read from right to left, so it would be nicer if this information were on the left, so that the author name is read first and then the information.

To solve this, we just put &rlm; after the name

<div dir="rtl">

* [عنوان بالعربي](URL) - Author Name&rlm; (meta data)
* [Title In LTR&rlm;](URL) - Author Name&rlm; (meta data)
</div>

Result:

Feb 10 '22 22:02 AhmedElTabarani

If we add a section about this solution to the Guidelines in CONTRIBUTING (after we finish discussing it here, of course), other contributors can do the same with their own RTL languages.

Feb 10 '22 22:02 AhmedElTabarani

Thanks for adding this. We can leave it open for a while.

Feb 11 '22 02:02 eshellman

As commented in #6715, these marks, whether written as an HTML entity or as the raw Unicode character, break the alphabetize plugin, and it is even worse when they are placed at the beginning of a sentence (the reason: see https://github.com/vhf/remark-lint-alphabetize-lists/blob/ee5f968040acf941c9c4d61fefb2bb1e3b1e8a7b/lib/alphabetical-list-items.js#L5-L14)

From Windows11 charmap.exe

Moreover, the non-printable raw character should be used instead of the HTML entity. Remember that Markdown markup should be HTML agnostic.
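As a rough illustration of both points, here is a small Python sketch. It is not what the alphabetize plugin does, and the names `entities_to_raw` and `sort_key` are hypothetical. It replaces the two entities with their raw characters and builds a comparison key that ignores the marks, so that an entry starting with a mark still sorts by its visible text.

```python
# Raw characters behind the two HTML entities.
ENTITY_TO_CHAR = {"&rlm;": "\u200f", "&lrm;": "\u200e"}


def entities_to_raw(line):
    """Replace &rlm; and &lrm; with the raw, non-printable characters."""
    for entity, char in ENTITY_TO_CHAR.items():
        line = line.replace(entity, char)
    return line


def sort_key(line):
    """Comparison key that ignores direction marks in entity or raw form."""
    raw = entities_to_raw(line)
    return raw.replace("\u200f", "").replace("\u200e", "").casefold()


entries = ["b", "\u200fa"]
print(sorted(entries))                # the leading mark pushes "a" after "b"
print(sorted(entries, key=sort_key))  # with the key, "a" comes first again
```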

Feb 11 '22 21:02 davorpa

@davorpa I can write regex patterns for all these cases. Would that help you detect them automatically, or something like that, in the future?

Sep 21 '22 15:09 AhmedElTabarani

> @davorpa I can write regex patterns for all these cases. Would that help you detect them automatically, or something like that, in the future?

Go ahead :wink:. It can be helpful to any maintainer :heart:

Sep 21 '22 15:09 davorpa

@AhmedElTabarani Hello sir, can I work on this?

Oct 15 '22 04:10 Mayank7225

> @AhmedElTabarani Hello sir, can I work on this?

About the regex patterns? OK, no problem at all.

I was working on it, but I have been very busy these past weeks.

I had decided to write a JavaScript script to detect all of these cases, plus some unit tests to keep everything organized.

This is the last thing I ended up with; maybe it will help you.

Case 0 (It is enough to make a div with dir='rtl')
* [تعلم البرمجة](URL) - Author Name
Regex:
^\* \[[^\w\d\?><;,\{\}\[\]\-_\+=!@\#\$%^&\*\|\']+\]\(.+\) - .+(?<!\(.+\))$


Case 1
* [تعلم HTML](URL) - Author Name
Regex:
^\* \[[\u04c7-\u0591\u05D0-\u05EA\u05F0-\u05F4\u0600-\u06FF-\u0621-\u064A\d\?><;,\{\}\[\]\-_\+=!@\#\$%^&\*\|\' ]+[\w\d]+\]\(.+\) - [\w\ ]+$

Case 2
* [Learn HTML](URL) - Author Name
Regex:
^\* \[[^\u04c7-\u0591\u05D0-\u05EA\u05F0-\u05F4\u0600-\u06FF-\u0621-\u064A]+[\w\d]\]\(.+\) - [\w\ ]+$

Extra Case 1
* C#
* [تعلم لغة C# الرائعة](URL) - إسم المؤلف


Extra Case 1.1
* [Learn C#](URL) - Author Name


Extra Case 2 (It is enough to make a div with dir='rtl')
* [Learn HTML](URL) - إسم المؤلف

Extra Case 3
* [عنوان بالعربي](URL) - Author Name (meta data)
* [Title In LTR&rlm;](URL) - Author Name (meta data)
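For the cases that have no regex yet, one possible alternative to hand-written character ranges is to ask Unicode for the direction of each character. The following is only a sketch in Python, not the JavaScript script mentioned above; `ENTRY`, `direction` and `classify` are hypothetical names, and it assumes entries follow the `* [Title](URL) - Author` shape used in this issue. Entities such as &rlm; would have to be converted to raw characters first, because the letters inside the entity text count as LTR.

```python
import re
import unicodedata

# Assumed entry shape: "* [Title](URL) - Author"
ENTRY = re.compile(r"^\* \[(?P<title>.+?)\]\((?P<url>.+?)\) - (?P<author>.+)$")


def direction(text):
    """Return 'rtl', 'ltr', 'mixed' or 'none' based on strong characters."""
    classes = {unicodedata.bidirectional(ch) for ch in text}
    has_rtl = bool(classes & {"R", "AL"})
    has_ltr = "L" in classes
    if has_rtl and has_ltr:
        return "mixed"
    if has_rtl:
        return "rtl"
    if has_ltr:
        return "ltr"
    return "none"


def classify(line):
    """Return (title direction, author direction), or None if not an entry."""
    match = ENTRY.match(line)
    if match is None:
        return None
    return direction(match["title"]), direction(match["author"])


print(classify("* [تعلم البرمجة](URL) - Author Name"))  # Case 0
print(classify("* [تعلم HTML](URL) - Author Name"))     # Case 1
print(classify("* [Learn HTML](URL) - Author Name"))    # Case 2
print(classify("* [Learn HTML](URL) - إسم المؤلف"))     # Extra Case 2
```

A checker could then map each pair to the case it belongs to, for example a mixed title with an LTR author to Case 1.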

Oct 15 '22 06:10 AhmedElTabarani

The main RTL languages are Arabic, Persian and Hebrew... which are only 3 out of all the languages translated on this repo... might be better to have a special section for these languages... as it is not relevant for all the LTR ones.

Oct 20 '22 12:10 avipars

Have you tried the following?

* Update the CONTRIBUTING.md file to include a section for RTL languages, explaining the issues, solutions, and usage of Unicode marks (RLM and LRM) for different cases.
* Create a separate section or a separate file specifically for Arabic, Persian, and Hebrew languages in the repository, as @avipars suggested. This would help maintain a better organization for RTL languages and make it easier to manage content for these languages separately.

Apr 09 '23 10:04 CryptoMitch

Some good ideas in this issue. Would welcome a PR.

Apr 09 '23 22:04 eshellman

Does this issue still need to be fixed?

Oct 01 '23 09:10 nerdberg792

Can I work on this issue? Thank you...

Oct 07 '23 02:10 JatinSainiOO7
