Fix duplicated and incomplete Thai characters at line endings
Fixes #3427
The find_safe_to_break function calls self.glyphs.binary_search_by to find the index of a glyph whose range starts at text_index. However, the binary_search_by documentation says “If there are multiple matches, then any one of the matches could be returned.” So when we resolve “end” towards the right, we can get any glyph index within the next grapheme cluster and break at that point, which pulls extra glyphs from the next grapheme cluster into the line. If we resolve “end” towards the left instead, we get the index of the first glyph of the next grapheme cluster and break there.
https://github.com/typst/typst/blob/main/crates/typst/src/layout/inline/shaping.rs#L518-L519
So changing this:
let right = self.find_safe_to_break(end, Side::Right)?;
to:
let right = self.find_safe_to_break(end, Side::Left)?;
might solve it?
Given this, passing in towards doesn't make a difference. Do I still need to keep this part of the logic?
Using binary_search_by without towards will break grapheme clusters.
Example: the glyph ranges of นกยางบินกันกิน are:
นก : 0..3, 3..6
ยาง : 6..9, 9..12, 12..15
บิน : 15..15, 15..21, 21..24
กัน : 24..24, 24..30, 30..33
กิน : 33..33, 33..39, 39..42
A binary_search_by for 24 lands on 24..30 (not 24..24), so the text breaks into:
นกยางบินก
ันกิน
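To make the ambiguity concrete, here is a minimal sketch of the lookup using the start offsets from the ranges listed above. It is not the actual typst code: find_glyph, STARTS and this Side enum are hypothetical names for illustration, and the sketch assumes left-to-right text with glyphs sorted by start offset. It runs the binary search and then walks to the outermost glyph with the same start offset on the requested side.

```rust
/// Which end of a run of equal start offsets to settle on.
/// Hypothetical enum for this sketch, mirroring the `Side` in the snippet above.
#[derive(Clone, Copy, PartialEq, Debug)]
enum Side {
    Left,
    Right,
}

/// Start offsets (in bytes) of the 14 glyph ranges from the example.
/// Offsets 15, 24 and 33 each appear twice because of the empty ranges.
const STARTS: [usize; 14] = [0, 3, 6, 9, 12, 15, 15, 21, 24, 24, 30, 33, 33, 39];

/// Find a glyph whose range starts at `text_index`, then move to the
/// outermost glyph on the `towards` side that shares that start offset.
/// Returns `None` if no glyph starts exactly at `text_index`.
fn find_glyph(starts: &[usize], text_index: usize, towards: Side) -> Option<usize> {
    // With duplicates, `binary_search_by` may land on any of the matches.
    let mut idx = starts.binary_search_by(|s| s.cmp(&text_index)).ok()?;
    match towards {
        Side::Left => {
            // Walk back to the first glyph with this start offset.
            while idx > 0 && starts[idx - 1] == text_index {
                idx -= 1;
            }
        }
        Side::Right => {
            // Walk forward to the last glyph with this start offset.
            while idx + 1 < starts.len() && starts[idx + 1] == text_index {
                idx += 1;
            }
        }
    }
    Some(idx)
}
```

With these offsets, find_glyph(&STARTS, 24, Side::Left) settles on index 8 (the 24..24 glyph) and Side::Right on index 9 (the 24..30 glyph), whichever of the two the binary search happened to hit first.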
The find_safe_to_break function calls self.glyphs.binary_search_by to find the index of a glyph whose range starts at text_index. However, the binary_search_by documentation says “If there are multiple matches, then any one of the matches could be returned.”
That is indeed the reason. But as the issue demonstrates, it means the 15..15 range will be picked up by both lines.
I'll need to think some more to figure out how best to fix this.
I'll close this, since the change set as-is doesn't fix the issue; it just shifts it.