SwiftTerm
SwiftTerm copied to clipboard
Chinese characters are chopped
Given a chinese character that uses more than one cell, and another one, we get two CTRuns, one for the Chinese character, and one that contains both a space and the second character. The space when we render seems to be overwriting the second half of the Chinese character:

The patch below extracts the position of the rune relative to the line, and helps a bit, but now the last character of the line is chopped, so there is some clipping somewhere I need to find, this is what it looks like now:

diff --git a/Sources/SwiftTerm/Apple/AppleTerminalView.swift b/Sources/SwiftTerm/Apple/AppleTerminalView.swift
index 59f8643..4ba20e0 100644
--- a/Sources/SwiftTerm/Apple/AppleTerminalView.swift
+++ b/Sources/SwiftTerm/Apple/AppleTerminalView.swift
@@ -564,9 +564,13 @@ extension TerminalView {
CTRunGetGlyphs(run, CFRange(), bufferPointer.baseAddress!)
count = runGlyphsCount
}
+ let x = [CGPoint](unsafeUninitializedCapacity: runGlyphsCount) { (bufferPointer, count) in
+ CTRunGetPositions(run, CFRange(), bufferPointer.baseAddress!)
+ count = runGlyphsCount
+ }
var positions = runGlyphs.enumerated().map { (i: Int, glyph: CGGlyph) -> CGPoint in
- CGPoint(x: lineOrigin.x + (cellDimension.width * CGFloat(col + i)), y: lineOrigin.y + ceil(lineLeading + lineDescent))
+ CGPoint(x: lineOrigin.x + x [i].x /* (cellDimension.width * CGFloat(col + i))*/, y: lineOrigin.y + ceil(lineLeading + lineDescent))
}
var backgroundColor: TTColor?
I think you need to convert the whole string to a width, not just the glyphs, especially in Unicode which can have reverse text, subscripts etc.
https://developer.apple.com/documentation/foundation/nsstring/1524729-boundingrectwithsize
I think you need to convert the whole string to a width, not just the glyphs, especially in Unicode which can have reverse text, subscripts etc.
https://developer.apple.com/documentation/foundation/nsstring/1524729-boundingrectwithsize
I think it's not the string width but glyph postions in line are wrong. You just point out a right direction of this issue, because the CoreText calculate each character's position not just using the char width multiply the index of it in a line. I will try to fix this issue, good luck to me.
The patch above sort of works, but there are some additional tasks that are required involving the cursor rendering - otherwise what the cursor renders when it is blinking and the text in the terminals are out of sync.
And we have an additional problem with Emoji
Emoji Problem
If I have this text on the first column:
🍂hello
With Emacs, the contents of the buffer are:
(lldb) p buffer.lines [vy][0].getCharacter()
(Character) "🍂"
(lldb) p buffer.lines [vy][1].getCharacter()
(Character) "h"
(lldb) p buffer.lines [vy][2].getCharacter()
(Character) "e"
For vi, instead, I get this:
(lldb) p buffer.lines [vy][0].getCharacter()
(Character) "🍂"
(lldb) p buffer.lines [vy][1].getCharacter()
(Character) "\0"
(lldb) p buffer.lines [vy][2].getCharacter()
(Character) "h"
The way this happens is that vi prints the emoji first at columb, then manually sets the cursor location to column 3, and prints "hello". This is why we have a 0 in the buffer at position [1]. Emacs on the other hand, will output the string in one go, and relies on the terminal to place the character in the right place.
With Chinese characters, we have a different situation, both vi and emacs output the character directly, without doing the cursor positioning trick, so this does not pose a problem.
The challenge seems to be that we might need to detect an Emoji, and in that case, if the next column contains a zero, not insert a space
The problem with the vi case is that when the offsets are computed, the emoji is at position 0, then the space corresponding to the 0 value is given the slot in column 3, rather than column 1. So we produce this:
Instead of:
Cursor Rendering
- Will need to set the cursor size based on the glyph size
- Will need to also compute the cursor location based on the runs before the one under the cursor.
There are a few scenarios to handle.
Bad news for vi
When we query the runs, we get two runs
- The emoji
- The string
" hello"
, notice the space at the beginning.
106 glyphs for hello
and the rest of the empty string
Glyph 0 is code 3
Glyph 1.. is code 570, 534, 605, 605, 629, 485, 3, 3, 3
So the “3” is a space, the good news is that it does not render anything, but we start rendering it at the wrong position, because we have a nil there.
This new patch handles the vi special scenario, and would need to also be replicated for cursors:
https://gist.github.com/migueldeicaza/b3451fb7880def299112e5e362c94323
Perhaps an alternative is to allocate cells when the characters are generated. So when double-width characters like Chinese characters or emoji are produced, I record both the character and an empty slot after, and use this information while rendering, so I can normalize all output into a grid.
And perhaps rather than trying to create a single attributed string for the whole line, I need to break up the attributed string at these width-changing boundaries
Some of the suggestion above already existed in the codebase, and I should addressed a few scenarios for characters that did not have the right metrics. There might be more places where I need to add this information.
Given a chinese character that uses more than one cell, and another one, we get two CTRuns, one for the Chinese character, and one that contains both a space and the second character. The space when we render seems to be overwriting the second half of the Chinese character:
![]()
The patch below extracts the position of the rune relative to the line, and helps a bit, but now the last character of the line is chopped, so there is some clipping somewhere I need to find, this is what it looks like now:
![]()
diff --git a/Sources/SwiftTerm/Apple/AppleTerminalView.swift b/Sources/SwiftTerm/Apple/AppleTerminalView.swift index 59f8643..4ba20e0 100644 --- a/Sources/SwiftTerm/Apple/AppleTerminalView.swift +++ b/Sources/SwiftTerm/Apple/AppleTerminalView.swift @@ -564,9 +564,13 @@ extension TerminalView { CTRunGetGlyphs(run, CFRange(), bufferPointer.baseAddress!) count = runGlyphsCount } + let x = [CGPoint](unsafeUninitializedCapacity: runGlyphsCount) { (bufferPointer, count) in + CTRunGetPositions(run, CFRange(), bufferPointer.baseAddress!) + count = runGlyphsCount + } var positions = runGlyphs.enumerated().map { (i: Int, glyph: CGGlyph) -> CGPoint in - CGPoint(x: lineOrigin.x + (cellDimension.width * CGFloat(col + i)), y: lineOrigin.y + ceil(lineLeading + lineDescent)) + CGPoint(x: lineOrigin.x + x [i].x /* (cellDimension.width * CGFloat(col + i))*/, y: lineOrigin.y + ceil(lineLeading + lineDescent)) } var backgroundColor: TTColor?
seems we don't need the space after a Unicode character.
for example, $ cd dir && tree . . ├── 你好.txt
in the func buildAttributedString(), https://github.com/migueldeicaza/SwiftTerm/blob/4df9211e2f039e426ba02859a7d668e40212d90f/Sources/SwiftTerm/Apple/AppleTerminalView.swift#L394, the str's value will become to "你 好 .txt", there's a space between '你' and '好', and a space between '好' and '.', even we fix the space overlapping issue (I'v tried don't append a space after a unicode character), it just works well in the rendering mode, but in the selection mode, the spaces still be there, and we paste it to somewhere else, we just got "你 好 .txt" rather than "你好.txt".
Given a chinese character that uses more than one cell, and another one, we get two CTRuns, one for the Chinese character, and one that contains both a space and the second character. The space when we render seems to be overwriting the second half of the Chinese character:
The patch below extracts the position of the rune relative to the line, and helps a bit, but now the last character of the line is chopped, so there is some clipping somewhere I need to find, this is what it looks like now:
diff --git a/Sources/SwiftTerm/Apple/AppleTerminalView.swift b/Sources/SwiftTerm/Apple/AppleTerminalView.swift index 59f8643..4ba20e0 100644 --- a/Sources/SwiftTerm/Apple/AppleTerminalView.swift +++ b/Sources/SwiftTerm/Apple/AppleTerminalView.swift @@ -564,9 +564,13 @@ extension TerminalView { CTRunGetGlyphs(run, CFRange(), bufferPointer.baseAddress!) count = runGlyphsCount } + let x = [CGPoint](unsafeUninitializedCapacity: runGlyphsCount) { (bufferPointer, count) in + CTRunGetPositions(run, CFRange(), bufferPointer.baseAddress!) + count = runGlyphsCount + } var positions = runGlyphs.enumerated().map { (i: Int, glyph: CGGlyph) -> CGPoint in - CGPoint(x: lineOrigin.x + (cellDimension.width * CGFloat(col + i)), y: lineOrigin.y + ceil(lineLeading + lineDescent)) + CGPoint(x: lineOrigin.x + x [i].x /* (cellDimension.width * CGFloat(col + i))*/, y: lineOrigin.y + ceil(lineLeading + lineDescent)) } var backgroundColor: TTColor?
seems we don't need the space after a Unicode character.
for example, $ cd dir && tree . . ├── 你好.txt
in the func buildAttributedString(),
https://github.com/migueldeicaza/SwiftTerm/blob/4df9211e2f039e426ba02859a7d668e40212d90f/Sources/SwiftTerm/Apple/AppleTerminalView.swift#L394
, the str's value will become to "你 好 .txt", there's a space between '你' and '好', and a space between '好' and '.', even we fix the space overlapping issue (I'v tried don't append a space after a unicode character), it just works well in the rendering mode, but in the selection mode, the spaces still be there, and we paste it to somewhere else, we just got "你 好 .txt" rather than "你好.txt".
Yes, when the terminal builds attributed string for rendering, it appends a whitespace char after some special chars( double width chars Chinese char or Emoji, or something else). But when drawing the attributed string, the whitespace char is also drawn, it must be handled specially.
Perhaps an alternative is to allocate cells when the characters are generated. So when double-width characters like Chinese characters or emoji are produced, I record both the character and an empty slot after, and use this information while rendering, so I can normalize all output into a grid.
And perhaps rather than trying to create a single attributed string for the whole line, I need to break up the attributed string at these width-changing boundaries
If we use an empty slot, we must handle it when drawing the attributed string
CTFontDrawGlyphs(runFont, runGlyphs, &positions, positions.count, context)
Or we can use zero width space character (https://en.wikipedia.org/wiki/Zero-width_space) as an empty slot.
I tried with "\u{200B}", the rendering is ok
But the selection still has some problem