Fix duplicated and incomplete Thai characters at line endings

Open OverflowCat opened this issue 1 year ago • 4 comments

Fixes #3427

OverflowCat avatar Jun 28 '24 17:06 OverflowCat

The find_safe_to_break function calls self.glyphs.binary_search_by to find the index of a glyph whose range start equals text_index. But the binary_search_by documentation says “If there are multiple matches, then any one of the matches could be returned.” So when we look up “end” towards the “right”, we can get any glyph index within the next grapheme cluster and break at that point, pulling some extra glyphs from the next grapheme cluster onto the line. If we look up “end” towards the “left” instead, we get the index of the first glyph of the next grapheme cluster and break there.

https://github.com/typst/typst/blob/main/crates/typst/src/layout/inline/shaping.rs#L518-L519

So changing

let right = self.find_safe_to_break(end, Side::Right)?;

to

let right = self.find_safe_to_break(end, Side::Left)?;

might solve it?
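To make the left/right distinction concrete, here is a minimal sketch of the lookup in isolation. It assumes the glyph start offsets are sorted in ascending order and models them as a plain slice of usize. The names `Side` and `find_start` are stand-ins chosen for illustration and are not Typst's actual code.

```rust
/// Which end of a run of equal start offsets to settle on.
/// (Illustrative stand-in, not Typst's own type.)
#[derive(Clone, Copy)]
enum Side {
    Left,
    Right,
}

/// Finds an index whose start offset equals `text_index`, then walks to the
/// first (`Side::Left`) or last (`Side::Right`) element of the run sharing it.
/// Assumes `starts` is sorted in ascending order.
fn find_start(starts: &[usize], text_index: usize, towards: Side) -> Option<usize> {
    // `binary_search` may land on any element of a run of equal values,
    // so the hit alone does not tell us where the run begins or ends.
    let mut idx = starts.binary_search(&text_index).ok()?;
    match towards {
        Side::Left => {
            // Step back while the previous element has the same start.
            while idx > 0 && starts[idx - 1] == text_index {
                idx -= 1;
            }
        }
        Side::Right => {
            // Step forward while the next element has the same start.
            while idx + 1 < starts.len() && starts[idx + 1] == text_index {
                idx += 1;
            }
        }
    }
    Some(idx)
}
```

With starts of `[0, 3, 6, 6, 6, 9]`, looking up 6 towards the left settles on index 2 and towards the right on index 4, whichever element the binary search happened to hit first.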

Marisada avatar Jul 07 '24 03:07 Marisada

Given this, passing in towards doesn't make a difference. Do I still need to keep this part of the logic?

OverflowCat avatar Jul 07 '24 18:07 OverflowCat

Using binary_search_by without the towards step will break grapheme clusters.

Example: glyph ranges of นกยางบินกันกิน

นก : 0..3, 3..6
ยาง : 6..9, 9..12, 12..15
บิน : 15..15, 15..21, 21..24
กัน : 24..24, 24..30, 30..33
กิน : 33..33, 33..39, 39..42

Searching for 24 with binary_search_by lands on 24..30 (not 24..24), so the text breaks as นกยางบินก ันกิน, splitting the cluster.
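The ranges above can be checked with a small standalone sketch. It copies the glyph ranges from this comment into a Vec and uses the standard library's `partition_point` to locate the whole run of glyphs that start at a given byte offset. The helper names `glyph_ranges` and `run_starting_at` are hypothetical, and `partition_point` is a swapped-in technique for illustration, not what find_safe_to_break does.

```rust
use std::ops::Range;

/// Glyph byte ranges for "นกยางบินกันกิน", copied from the list above.
fn glyph_ranges() -> Vec<Range<usize>> {
    vec![
        0..3, 3..6,             // นก
        6..9, 9..12, 12..15,    // ยาง
        15..15, 15..21, 21..24, // บิน
        24..24, 24..30, 30..33, // กัน
        33..33, 33..39, 39..42, // กิน
    ]
}

/// Returns the indices of the first and last glyph whose range starts at
/// `text_index`, or `None` if no glyph starts there.
/// Assumes the glyphs are sorted by range start.
fn run_starting_at(glyphs: &[Range<usize>], text_index: usize) -> Option<(usize, usize)> {
    // Index of the first glyph with start >= text_index.
    let first = glyphs.partition_point(|g| g.start < text_index);
    // Index one past the last glyph with start <= text_index.
    let end = glyphs.partition_point(|g| g.start <= text_index);
    if first == end {
        None
    } else {
        Some((first, end - 1))
    }
}
```

For offset 24 this gives the pair (8, 9): glyph 8 is 24..24 and glyph 9 is 24..30. A plain binary search for 24 is allowed to return either index, which is why a break computed from it can land inside the cluster.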

Marisada avatar Jul 08 '24 13:07 Marisada

At find_safe_to_break function will call self.glyphs.binary_search_by to find an index of glyph with start range equal to text_index. But binary_search_by says “If there are multiple matches, then any one of the matches could be returned.”

That is indeed the reason. But as the issue demonstrates, it means the 15..15 range will be picked up by both lines.

I'll need to think some more to figure out how best to fix this.

laurmaedje avatar Jul 22 '24 14:07 laurmaedje

I'll close this, as the change set as-is doesn't fix the issue; it just shifts it.

laurmaedje avatar Aug 11 '24 20:08 laurmaedje