asciiflow
Cannot handle CJK characters correctly
In terminals, a CJK character usually occupies two visual columns instead of one.
This means that 4 Chinese characters usually have the same visual length as 8 English characters.
This figure compares 4 Chinese characters with 8 English characters; they have the same visual length.
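To make the two-columns-per-character rule concrete, here is a minimal sketch of a width function. The names `WIDE_RANGES`, `isWide` and `displayWidth` are made up for illustration, and the range list is only a hand-picked approximation of the Wide and Fullwidth classes in UAX #11, not the full table. It also ignores combining and control characters.

```javascript
// Code point ranges counted as two columns wide.
// Assumption: an approximation of the UAX #11 Wide/Fullwidth classes.
const WIDE_RANGES = [
  [0x1100, 0x115f],   // Hangul Jamo initial consonants
  [0x2e80, 0x303e],   // CJK radicals, CJK symbols and punctuation
  [0x3041, 0x33ff],   // Hiragana, Katakana, CJK compatibility
  [0x3400, 0x4dbf],   // CJK Unified Ideographs Extension A
  [0x4e00, 0x9fff],   // CJK Unified Ideographs
  [0xac00, 0xd7a3],   // Hangul syllables
  [0xf900, 0xfaff],   // CJK compatibility ideographs
  [0xff01, 0xff60],   // Fullwidth ASCII variants and punctuation
  [0xffe0, 0xffe6],   // Fullwidth signs
  [0x20000, 0x3fffd], // Supplementary ideographic planes
];

// True if the code point falls inside one of the ranges above.
function isWide(codePoint) {
  return WIDE_RANGES.some(([lo, hi]) => codePoint >= lo && codePoint <= hi);
}

// Sums the column count of a string: 2 per wide character, 1 otherwise.
function displayWidth(text) {
  let width = 0;
  for (const ch of text) { // for...of iterates by code point, not by UTF-16 unit
    width += isWide(ch.codePointAt(0)) ? 2 : 1;
  }
  return width;
}

console.log(displayWidth("abcdefgh")); // 8
console.log(displayWidth("这是中文")); // 8
```

Note that `"这是中文".length` is 4, which is why counting string length alone puts the figure out of alignment.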

asciiflow2 treats CJK characters as having the same visual length as English characters, which is wrong and breaks the figure:
In the preview, each CJK character is put into a single visual column, so the characters are squeezed together and look ugly.

After export, each CJK character is shown correctly at 2 visual columns, but the figure breaks.

One thing makes this more complicated. In terminals, CJK characters do have double visual length, but that may not be true in web browsers or other applications. Those render rich text, so the visual length depends on the font in use.
Different English monospace fonts have different visual lengths. But when they show CJK characters that the font does not define, they all fall back to the same default CJK font, which has the same visual length regardless of the English font.
So the web preview may still break the figure even if the visual length is taken into account, unless the font is carefully chosen and specified.
But if the visual length is taken into account, the figure will be correct after copying it to a terminal, and no font definition is required.
The following code renders like this in a web view:
<p style="font-family: 'Courier New';">
abcdefgh<br/>
这是中文
</p>
<p style="font-family: 'Lucida Console';">
abcdefgh<br/>
这是中文
</p>

I think #61 is about the same thing, but its author did not describe it clearly enough.
There is a full description of CJK character widths in Unicode® Standard Annex #11.
Absolutely a bug, thank you for the excellently written report. I'm pretty much as white and western as they come, so Chinese characters are a bit of a new area for me. Is the assumption that CJK characters are exactly (or supposed to be) double the width of Latin characters a standard thing? Or is it just convention?
Thanks again :)
I'm not a language expert, so I don't know if my understanding is correct.
AFAIK all Chinese characters should be shaped like a square, and all characters at the same font size should have the same width and height (there is no difference between proportional and monospace). Latin characters should be shaped like a tall rectangle, and may have different widths even at the same font size (unless a monospace font is used instead of a proportional one).
For Japanese, things get a little more complex. Japanese uses three different writing systems together. Some Japanese characters (those borrowed from Chinese) should be shaped like a square (these are called "fullwidth characters"), while some other characters (not borrowed from Chinese) should be shaped like a tall rectangle (these are called "halfwidth characters").
I'm not familiar with Korean. From the Korean text I've seen, I think Korean uses fullwidth characters for text and Latin characters for punctuation. Japanese also uses Latin punctuation, while Chinese has its own fullwidth punctuation characters, "，" instead of "," for example.
Fullwidth characters are not required to be exactly double the width of Latin characters. But when alignment matters, such as when writing program code, drawing ASCII art, or in any other situation where a monospace font should be used, fullwidth CJK characters are supposed to be double the width and the same height as Latin characters, and halfwidth CJK characters are supposed to be the same width and the same height as Latin characters. This is the only way to get the alignment right.
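As a rough illustration of the alignment rule, the sketch below pads strings to a fixed number of columns, counting fullwidth characters as 2 and halfwidth ones as 1. `WIDE`, `visualWidth` and `padEndVisual` are hypothetical names, and the regular expression is a BMP-only approximation of the fullwidth ranges, so treat it as an assumption rather than a complete classification.

```javascript
// Assumption: BMP-only approximation of fullwidth/wide characters.
// Halfwidth katakana (U+FF61 to U+FF9F) are deliberately left out, so they count as 1.
const WIDE = /[\u1100-\u115F\u2E80-\u303E\u3041-\u33FF\u3400-\u4DBF\u4E00-\u9FFF\uAC00-\uD7A3\uF900-\uFAFF\uFF01-\uFF60\uFFE0-\uFFE6]/;

// Counts 2 columns per fullwidth character and 1 for everything else.
function visualWidth(text) {
  let width = 0;
  for (const ch of text) width += WIDE.test(ch) ? 2 : 1;
  return width;
}

// Appends spaces until the text occupies `columns` columns (never truncates).
function padEndVisual(text, columns) {
  const missing = Math.max(0, columns - visualWidth(text));
  return text + " ".repeat(missing);
}

const rows = [
  "abc",                // Latin letters, 3 columns
  "\u4E2D\u6587",       // two Chinese ideographs, 4 columns
  "\uFF71\uFF72\uFF73", // three halfwidth katakana, 3 columns
  "a\uFF0Cb",           // fullwidth comma between Latin letters, 4 columns
];
for (const row of rows) console.log("|" + padEndVisual(row, 6) + "|");
```

In a terminal the closing `|` of every row should line up, whereas padding with the built-in `padEnd` would leave the rows containing fullwidth characters too long.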
> fullwidth CJK characters are supposed to be double the width of latin characters and the same height of latin characters, and halfwidth CJK characters are supposed to be the same width and the same height of latin characters
This is what I was looking for. I'll make sure to incorporate this into v3.
Hi, for Chinese characters I actually fixed the display and export parts (but not the import part); using a regular expression works fine. The original function comes first, followed by my modified version:
f.g = function() {
  var a = $("#text-tool-input").val();
  N(this.state);
  for (var b = 0, c = 0, d = 0; d < a.length; d++) {
    "\n" == a[d] ? (c++, b = 0) : (L(this.state, this.b.add(new p(b, c)), a[d]), b++);
  }
};
f.g = function() {
  var a = $("#text-tool-input").val();
  N(this.state);
  // Matches a single CJK ideograph (U+4E00 to U+9FA5). Built once, outside the loop.
  var reg = /[\u4E00-\u9FA5]/;
  for (var b = 0, c = 0, d = 0; d < a.length; d++) {
    if ("\n" == a[d]) {
      // Newline: move down one row and reset the column.
      c++;
      b = 0;
    } else {
      L(this.state, this.b.add(new p(b, c)), a[d]);
      // An ideograph takes two columns, anything else takes one.
      b += reg.test(a[d]) ? 2 : 1;
    }
  }
};
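For anyone who wants to try the idea outside the minified bundle, here is a self-contained sketch of the same approach. It is not asciiflow's real data model: `layoutText`, `exportRow` and the `CONTINUATION` marker are hypothetical names, the cell map keyed by "column,row" is an assumption, and the regular expression only covers the same ideograph range as the patch above, so kana, Hangul and fullwidth punctuation would still be counted as one column.

```javascript
// Hypothetical marker for the right half of a two-column character.
const CONTINUATION = null;
// Same range as the patch: CJK ideographs U+4E00 to U+9FA5 only.
const IDEOGRAPH = /[\u4E00-\u9FA5]/;

// Places text into a Map keyed by "column,row", reserving two cells per ideograph.
function layoutText(text) {
  const cells = new Map();
  let col = 0;
  let row = 0;
  for (const ch of text) {
    if (ch === "\n") {
      row++;
      col = 0;
      continue;
    }
    cells.set(`${col},${row}`, ch);
    if (IDEOGRAPH.test(ch)) {
      // Reserve the next cell so nothing else is drawn on top of the wide glyph.
      cells.set(`${col + 1},${row}`, CONTINUATION);
      col += 2;
    } else {
      col += 1;
    }
  }
  return cells;
}

// Renders one row as a string: empty cells become spaces, continuation cells emit nothing.
function exportRow(cells, row, columns) {
  let out = "";
  for (let col = 0; col < columns; col++) {
    const key = `${col},${row}`;
    if (!cells.has(key)) {
      out += " ";
    } else if (cells.get(key) !== CONTINUATION) {
      out += cells.get(key);
    }
  }
  return out;
}

const cells = layoutText("中a\nbc");
console.log(JSON.stringify(exportRow(cells, 0, 4))); // "中a "
console.log(JSON.stringify(exportRow(cells, 1, 4))); // "bc  "
```

Both exported rows occupy 4 terminal columns, even though the first string has only 3 characters.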
Hope this bug can be fixed.