vscode-ibmi
Support command to insert LRM in order to support programming in bidi languages
Is your feature request related to a problem? Please describe.
The modern, git-friendly approach that Code 4 i enables is to store source in UTF-8 stream files. The compilers must convert that source to EBCDIC before compiling, and this works as long as the right TGTCCSID parameter is specified. However, some lines of RPG and other source in Hebrew and Arabic are ambiguous and are reversed when converted to EBCDIC, causing compile failures. The solution is to insert the Unicode \u200e Left-To-Right Mark (LRM) character, which tells the bidi transform that the line is LTR source so that it is converted correctly. Unfortunately, VS Code does not have any built-in facility to insert LRM characters, so this workaround is very difficult for Hebrew and Arabic programmers to use.
Describe the solution you'd like
The request is to have a simple Command/Action "Insert Left-To-Right Mark" which will insert the \u200e character at the cursor.
For bonus points, render the LRM in a way that does not shift everything to the right and misalign columns.
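As a rough illustration of what the requested command would do to the text, here is a minimal TypeScript sketch. The function name `insertLrmAt` is hypothetical, and the wiring to an actual editor command (for example through the VS Code `registerTextEditorCommand` API) is left out so the snippet runs on its own.

```typescript
// U+200E LEFT-TO-RIGHT MARK: an invisible, zero-width, strong LTR character.
const LRM = "\u200E";

/**
 * Returns `text` with an LRM inserted at `offset`.
 * `offset` is a UTF-16 code unit index (hypothetical stand-in for the cursor
 * column) and is clamped to the bounds of the string.
 */
function insertLrmAt(text: string, offset: number): string {
  const at = Math.max(0, Math.min(offset, text.length));
  return text.slice(0, at) + LRM + text.slice(at);
}

const line = "//זו דוגמא להערה עם עברית : remark";
const fixed = insertLrmAt(line, 0);

console.log(fixed.length - line.length);        // 1
console.log(fixed.charCodeAt(0).toString(16));  // 200e
```

Because the mark is invisible, an editor would probably also need some way to render it, which is what the "bonus points" request is about.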
Describe alternatives you've considered
A more advanced solution would detect where these marks are needed and insert them automatically, but we are asking for the low-hanging fruit to start.
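The automatic detection could look something like the following sketch. It is only a heuristic under stated assumptions: the character ranges are a deliberately narrow stand-in for the Unicode bidi classes (basic Hebrew and Arabic letters as strong RTL, ASCII letters as strong LTR), the function names are hypothetical, and placing the mark at the start of the line is an assumption rather than something this request specifies.

```typescript
const LRM_CHAR = "\u200E"; // LEFT-TO-RIGHT MARK, strong LTR
const RLM_CHAR = "\u200F"; // RIGHT-TO-LEFT MARK, strong RTL

// Simplified character classes (assumption: not the full Unicode bidi data).
const RTL_LETTER = /[\u05D0-\u05EA\u0620-\u064A]/; // Hebrew alef..tav, basic Arabic letters
const LTR_LETTER = /[A-Za-z]/;

/** True when the first strong character found on the line is right-to-left. */
function firstStrongIsRtl(line: string): boolean {
  for (let i = 0; i < line.length; i++) {
    const ch = line.charAt(i);
    if (ch === LRM_CHAR || LTR_LETTER.test(ch)) return false;
    if (ch === RLM_CHAR || RTL_LETTER.test(ch)) return true;
    // Anything else (spaces, punctuation, digits) is skipped as non-strong.
  }
  return false; // no strong character at all: nothing to fix
}

/** Prefixes the line with an LRM only if it would otherwise resolve as RTL. */
function withLrmIfNeeded(line: string): string {
  return firstStrongIsRtl(line) ? LRM_CHAR + line : line;
}

console.log(firstStrongIsRtl("//זו דוגמא להערה עם עברית : remark")); // true
console.log(firstStrongIsRtl("dcl-s name char(10); // עברית"));      // false
```

Since an existing LRM counts as a strong LTR character, running `withLrmIfNeeded` on an already-fixed line leaves it unchanged.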
Additional context
Some lines of RPG and other source in Hebrew and Arabic are ambiguous and get read by the RPG compiler/IFS open API/iconv in reverse order, because the Bidi string type of Unicode is 10, which is contextual and depends on the contents of the line. Normally this is not a problem, as RPG lines begin with opcodes made of Latin characters, which are strong LTR characters. Inserting an LRM informs the iconv API that the line is LTR, so it interprets the line in the right order.
Let Hebrew be represented by CAPITAL letters. An example of an ambiguous line has the logical order:
// ABC : english
When converted to Hebrew CCSID 424 visual ordering, we want:
// CBA : english
Spaces and punctuation are neutral characters whose ordering depends on the surrounding characters. Because the first strong character found is 'A', a Hebrew RTL character, the whole string is converted as right-to-left, resulting in:
english : CBA //
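The effect can be reproduced with a toy model, sketched here in TypeScript. This is not the full Unicode bidirectional algorithm and not necessarily what the IBM i conversion does internally. It assumes the convention above (capitals are strong RTL, lowercase letters and the LRM are strong LTR, everything else is neutral), takes the base direction from the first strong character, and gives a neutral the direction of its neighbours only when both sides agree. All names are hypothetical.

```typescript
type Dir = "L" | "R";

// Toy classification: capitals stand in for Hebrew.
function strongDir(ch: string): Dir | null {
  if (ch === "\u200E" || /[a-z]/.test(ch)) return "L";
  if (/[A-Z]/.test(ch)) return "R";
  return null; // neutral: spaces, punctuation, digits
}

function reverseText(s: string): string {
  return s.split("").reverse().join("");
}

/** Converts a logical-order string to a simplified visual order. */
function toVisual(logical: string): string {
  const chars = logical.split("");
  const strong = chars.map(strongDir);
  // Base direction comes from the first strong character (LTR if none).
  const base: Dir = strong.filter((d) => d !== null)[0] ?? "L";

  // next[i]: direction of the nearest strong character at or after i.
  const next: Dir[] = new Array(chars.length);
  let seen: Dir = base;
  for (let i = chars.length - 1; i >= 0; i--) {
    seen = strong[i] ?? seen;
    next[i] = seen;
  }

  // Resolve neutrals: same direction on both sides wins, otherwise the base.
  const dirs: Dir[] = [];
  let prev: Dir = base;
  for (let i = 0; i < chars.length; i++) {
    const s = strong[i];
    if (s !== null) {
      prev = s;
      dirs.push(s);
    } else {
      dirs.push(prev === next[i] ? prev : base);
    }
  }

  // Split into runs of equal direction and reverse the characters of RTL runs.
  const pieces: string[] = [];
  let start = 0;
  for (let i = 1; i <= chars.length; i++) {
    if (i === chars.length || dirs[i] !== dirs[start]) {
      const run = logical.slice(start, i);
      pieces.push(dirs[start] === "R" ? reverseText(run) : run);
      start = i;
    }
  }
  // In an RTL line the runs themselves are laid out right to left.
  return (base === "R" ? pieces.reverse() : pieces).join("");
}

console.log(toVisual("// ABC : english"));       // english : CBA //
console.log(toVisual("\u200E// ABC : english")); // (invisible LRM)// CBA : english
```

With the LRM in front, the first strong character is LTR, so only the Hebrew run is reversed and the comment marker stays at the start of the line.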
Here is a real example
//זו דוגמא להערה עם עברית : remark
If we insert the LRM we get the correct rendering
cl: crtsrcpf liama/hebrew ccsid(424);
cl: ADDPFM FILE(LIAMA/HEBREW) MBR(HEBREW) TEXT('Hebrew');
insert into liama/hebrew values (
0, 0, x'E2D5C4D1D4C1C9D3404040E2E4C2D1C5C3E34D7D4040405551586445404468695640714668516940714651586440484644407D5D404E'
);
cl: CPYTOSTMF FROMMBR('/QSYS.LIB/LIAMA.LIB/HEBREW.FILE/HEBREW.MBR') TOSTMF('/tmp/vscodetemp-O_0p9XHiyZ') STMFOPT(*REPLACE) STMFCCSID(1208) DBFCCSID(62245);
select * from liama/hebrew;