Screen.draw() silently drops text due to handling of SOH/STX

Open moyix opened this issue 1 year ago • 1 comments

In bash, \[ and \] (common in shell prompts) are apparently turned into SOH (0x01) and STX (0x02), respectively. I believe the parser in Stream does not handle them correctly: sequences like \x02text are passed to Screen.draw(), which bails at the first control character it encounters and therefore drops the text that follows.

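import pyte

def box(lines):
    # Frame the screen lines so that trailing spaces are visible.
    w = len(lines[0])
    tb = f"+{'-'*w}+"
    return '\n'.join([tb]+[f'|{l}|' for l in lines]+[tb])

screen = pyte.Screen(10, 3)
stream = pyte.Stream(screen)
# \x01 (SOH) and \x02 (STX) stand in for bash's \[ and \] prompt markers.
stream.feed('hello\r\n\x01dropped\x02dropped\r\n')
print(box(screen.display))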
# Prints:
# +----------+
# |hello     |
# |          |
# |          |
# +----------+

The DebugStream gives:

["draw", ["hello"], {}]
["carriage_return", [], {}]
["linefeed", [], {}]
["draw", ["\u0001dropped\u0002dropped"], {}]
["carriage_return", [], {}]
["linefeed", [], {}]

The root cause seems to be the following regex to match "plain text": https://github.com/selectel/pyte/blob/636b679d7af5267c34a75045561642f8e9c6164a/pyte/streams.py#L134-L140

Which is then used to find chunks of plain text that can be passed to draw(): https://github.com/selectel/pyte/blob/636b679d7af5267c34a75045561642f8e9c6164a/pyte/streams.py#L190-L206

So a minimal solution (which works for me) would just be to add SOH and STX to the _special set so that they aren't treated as plain text. But maybe all of the control characters (0x00-0x1F) should be excluded as well?
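
To make the proposed fix concrete, here is a minimal standalone sketch. It assumes the plain-text pattern is a negated character class built from a set of special characters, which is what the description of `_special` suggests. The names `SPECIAL`, `build_text_pattern` and `plain_chunks` are hypothetical and are not part of pyte.

```python
import re

SOH, STX = "\x01", "\x02"

# Hypothetical stand-in for the characters the parser handles itself.
# This is NOT pyte's actual _special set, only an illustration.
SPECIAL = {"\x1b", "\x00", "\x7f", "\r", "\n"}


def build_text_pattern(special):
    """Compile a pattern matching runs of characters that are not in `special`."""
    return re.compile("[^" + "".join(map(re.escape, sorted(special))) + "]+")


def plain_chunks(pattern, data):
    """Return the runs of `data` that the pattern treats as plain text."""
    return pattern.findall(data)


data = "\x01dropped\x02dropped\r\n"

before = plain_chunks(build_text_pattern(SPECIAL), data)
after = plain_chunks(build_text_pattern(SPECIAL | {SOH, STX}), data)

print(before)  # ['\x01dropped\x02dropped']
print(after)   # ['dropped', 'dropped']
```

With SOH and STX in the set, the plain-text chunks no longer start with or contain a control character, so a draw() that stops at control characters would render both words.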

Here is a quick patch. Happy to make a PR: https://github.com/selectel/pyte/compare/master...moyix:pyte:moyix/fix_ctrl_chars

moyix, Dec 23 '24 14:12

Text after control characters being dropped is caused by https://github.com/selectel/pyte/commit/1a9b9146cb729182df9c397dc3e30a4634c45fc9#diff-e059a3b34c367de4baecc0cbdce1847f5a268d08d65ae78128a4d856df7e298cR498

Before that change, draw() just ignored control characters. With this commit, text rendering is terminated as soon as a control character is detected (wcwidth() returns -1).

That is not a problem when the parser calls draw() with a single character, because for a control character the call simply becomes a no-op:

https://github.com/selectel/pyte/blob/61d0c0c769d7dd12957087fa2c23d231345f66aa/pyte/streams.py#L381

The following line is even redundant, as draw() doesn't render the CAN or SUB control characters:

https://github.com/selectel/pyte/blob/61d0c0c769d7dd12957087fa2c23d231345f66aa/pyte/streams.py#L338
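
The change in draw() described above can be illustrated with a small standalone sketch. It does not use pyte or the wcwidth package: `char_width` is a simplified, hypothetical stand-in that returns -1 for every control character, and the two `draw_*` functions only model the "ignore" and "terminate" behaviours on a plain string.

```python
import unicodedata


def char_width(ch):
    """Simplified stand-in for wcwidth(): -1 for control characters, else 1.

    Assumption for illustration only; the real wcwidth() also reports
    widths of 0 and 2 for some characters.
    """
    return -1 if unicodedata.category(ch) == "Cc" else 1


def draw_ignoring(data):
    """Model of the earlier behaviour: control characters are skipped."""
    return "".join(ch for ch in data if char_width(ch) >= 0)


def draw_terminating(data):
    """Model of the later behaviour: stop at the first control character."""
    out = []
    for ch in data:
        if char_width(ch) < 0:
            break
        out.append(ch)
    return "".join(out)


chunk = "\x01dropped\x02dropped"
print(repr(draw_ignoring(chunk)))     # 'droppeddropped'
print(repr(draw_terminating(chunk)))  # ''
print(repr(draw_terminating("\x01")))  # '' (single control character: a no-op)
```

The single-character case behaves the same under both models, which is why the change only shows up when a whole chunk containing control characters reaches draw().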


Stream._text_pattern seems to be used primarily to terminate directly rendered chunks of plain text when a known control or escape sequence is found, which must be sent to the parser so that it can generate events.


As draw() doesn't render control characters and stops at them, they should indeed be caught by Stream. Those that do not trigger any event should probably just be ignored, like NUL and DEL; calling draw() for them is useless.
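
A rough sketch of that idea, independent of pyte's actual parser: every control character ends a plain-text run, those with a known event are dispatched, and the rest are dropped without ever reaching draw(). The `HANDLED` mapping and the `events` function are hypothetical, and escape sequences are deliberately left out to keep the example short.

```python
import re

# Hypothetical, deliberately tiny mapping from control characters to events.
HANDLED = {"\r": "carriage_return", "\n": "linefeed"}

# Every C0 control character plus DEL ends a plain-text run.
CONTROL = re.compile(r"[\x00-\x1f\x7f]")


def events(data):
    """Split `data` into (event, argument) pairs; draw never sees a control."""
    out = []
    pos = 0
    for match in CONTROL.finditer(data):
        if match.start() > pos:
            out.append(("draw", data[pos:match.start()]))
        name = HANDLED.get(match.group())
        if name is not None:
            out.append((name, None))
        # Controls without an event (NUL, SOH, STX, DEL, ...) are ignored.
        pos = match.end()
    if pos < len(data):
        out.append(("draw", data[pos:]))
    return out


result = events("hello\r\n\x01dropped\x02dropped\r\n")
for event in result:
    print(event)
```

For the input from the original report, this yields two separate draw events for "dropped" instead of one chunk that begins with SOH.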

deathaxe, Apr 25 '25 12:04