Use `termscraper`'s `Screen` to interpret the escape sequences read by `pexpect` before prompt/type matching

Open eldipa opened this issue 3 years ago • 1 comments

Describe the feature you'd like

byexample calls pexpect to read from the interpreter and wait until a given regex matches.

In the most common case the regex is the prompt of the interpreter, so byexample can use it to get in and stay in sync with the interpreter and to know which output came since the last time it was in sync.

This works quite well but it runs into trouble if the interpreter writes escape codes.

In the best case this will not interfere with prompt matching but will just make the output dirty.

In the worst case, byexample will never see the prompt because pexpect will never match its regex.

The problem is that byexample, via pexpect, works at the raw output level and does terminal emulation only after being in sync.

The raw output should be passed through the terminal emulation as soon as it is read and before the regex scan.
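To make the worst case concrete, here is a minimal sketch using only `re`. The prompt regex and the raw output are hypothetical, and stripping CSI sequences is just a crude stand-in for real terminal emulation; it is not what byexample or termscraper do.

```python
import re

# Hypothetical prompt regex, similar in spirit to what is handed to pexpect.
PROMPT_RE = re.compile(r">>> ")

# Hypothetical raw output: the interpreter resets the text attributes between
# ">>>" and the trailing space, so an escape sequence sits inside the prompt.
raw = "42\r\n>>>\x1b[0m "

# CSI sequence: ESC [ parameter bytes, intermediate bytes, one final byte.
CSI_RE = re.compile(r"\x1b\[[0-?]*[ -/]*[@-~]")


def strip_csi(data):
    """Crude stand-in for terminal emulation: drop every CSI sequence."""
    return CSI_RE.sub("", data)


raw_match = PROMPT_RE.search(raw)                # scan at the raw level
clean_match = PROMPT_RE.search(strip_csi(raw))   # scan after "emulation"

print(raw_match)                # None: the escape sequence splits the prompt
print(clean_match is not None)  # True: the prompt is found once it is gone
```

Scanning the raw data never finds the prompt, which is the situation where pexpect would wait until it times out.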

pexpect uses an Expecter class and two StringIO buffers to hold and drive the searching process.

These two buffers could be replaced by a special TermBuffer, which would be a linear view of termscraper's Screen (to be implemented).

The part of the StringIO API to be reimplemented is not large. Here is a draft of how each method could be reimplemented using a Screen:

    - buf.truncate  =>  screen.reset
    - buf.write     =>  screen.feed
    - buf.seek      =>  screen.move_cursor
    - buf.tell      =>  screen.get_cursor
    - buf.read      =>  screen.buffer
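The mapping above could be sketched roughly as follows. `ToyScreen` is a hypothetical one-row stand-in written for this example, not termscraper's `Screen`, and `TermBuffer` only models the handful of calls listed; the real method names and semantics would have to be checked against both pexpect and termscraper.

```python
import re


class ToyScreen:
    """Hypothetical one-row screen; NOT termscraper's Screen API."""

    # Either a CSI sequence (parameters + final byte) or a single character.
    TOKEN = re.compile(r"\x1b\[([0-9;]*)([@-~])|(.)", re.DOTALL)

    def __init__(self):
        self.reset()

    def reset(self):
        self.cells = []  # characters on the row
        self.x = 0       # cursor column

    def feed(self, data):
        for count, final, char in self.TOKEN.findall(data):
            if final == "C":                  # cursor forward
                self.x += int(count or 1)
            elif final == "D":                # cursor back
                self.x = max(0, self.x - int(count or 1))
            elif final:                       # any other CSI: ignored
                continue
            elif char == "\r":                # carriage return
                self.x = 0
            else:                             # anything else is printed
                # Pad with spaces if the cursor sits past the written text.
                self.cells.extend(" " * (self.x + 1 - len(self.cells)))
                self.cells[self.x] = char
                self.x += 1


class TermBuffer:
    """StringIO-like linear view over the screen (sketch only)."""

    def __init__(self):
        self.screen = ToyScreen()

    def write(self, data):          # buf.write => screen.feed
        self.screen.feed(data)
        return len(data)

    def truncate(self, size=None):  # buf.truncate => screen.reset
        self.screen.reset()         # only "drop everything" is modelled
        return 0

    def seek(self, pos):            # buf.seek => move the cursor
        self.screen.x = pos
        return pos

    def tell(self):                 # buf.tell => cursor position
        return self.screen.x

    def read(self):                 # buf.read => text from the cursor on
        return "".join(self.screen.cells[self.screen.x:])


buf = TermBuffer()
buf.write("\x1b[1mfoo\x1b[0m\x1b[2Cbar")
end_pos = buf.tell()
buf.seek(0)
text = buf.read()
print(repr(text), end_pos)  # 'foo  bar' 8
```

The attribute sequences leave no trace in `text`, and the cursor-forward sequence shows up as two spaces, which is what a regex written against the visible output would expect.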

The challenge is buf.getvalue. For a StringIO, getvalue returns the whole data in the buffer. But a Screen is always pre-filled with spaces, which is not what the caller of buf.getvalue expects.

A better approximation could be to get from Screen all the data from (0 0) to the current cursor position. That should represent exactly what was written "in the buffer", but we still have the problem that each line in the screen will be right-padded with a lot of artificial spaces.

Another solution could be to make the Screen one row high. The idea is that a prompt is always on one line, so getting the data written "in the buffer" would be as simple as getting the data from (0 0) to (x 0), where x is the x coordinate of the cursor. Notice that the only possible artificial spaces are located after (x 0), so they will not matter.

Assuming that a prompt is always on one line, this may work. But prompts are not the only regexes to be searched. To support +type, arbitrary text is converted to a regex and searched, and this may span more than one line.
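The three getvalue candidates can be compared with a toy grid. This is only a sketch: the grid is a plain list of lists pre-filled with spaces, the screen size is made up, and no escape sequences are interpreted.

```python
def make_screen(columns, rows):
    """A screen starts pre-filled with spaces, unlike an empty StringIO."""
    return [[" "] * columns for _ in range(rows)]


def write(screen, cursor, text):
    """Write plain text at the cursor (x, y) and return the new cursor."""
    x, y = cursor
    for char in text:
        if char == "\n":
            x, y = 0, y + 1
        else:
            screen[y][x] = char
            x += 1
    return x, y


def whole_screen(screen):
    """Naive getvalue: every cell, padding included."""
    return "\n".join("".join(row) for row in screen)


def up_to_cursor(screen, cursor):
    """Everything from (0 0) to the cursor position."""
    x, y = cursor
    rows = ["".join(row) for row in screen[:y]] + ["".join(screen[y][:x])]
    return "\n".join(rows)


# Hypothetical 20x3 screen.
screen = make_screen(20, 3)
cursor = write(screen, (0, 0), "hello\n>>> ")
full = whole_screen(screen)
partial = up_to_cursor(screen, cursor)

# Hypothetical one-row screen.
one_row_screen = make_screen(20, 1)
one_row_cursor = write(one_row_screen, (0, 0), ">>> ")
one_row = up_to_cursor(one_row_screen, one_row_cursor)

print(len(full))      # 62: three padded rows plus two newlines
print(repr(partial))  # first row still right-padded to 20 columns
print(repr(one_row))  # '>>> '
```

Only the one-row variant returns exactly what was written, and only because nothing in this toy input needed a second row.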

Once those problems are fixed there is another issue. pexpect's Expecter may assume that if it receives data of length L and writes it to the buffer, the length of the buffer increases by L.

This is true for StringIO but not necessarily for Screen:

- if the data has escape sequences that do not really write anything, the length of the screen will increase by a smaller amount than expected.
- if the data has escape sequences that move the cursor to the right, the length of the screen will increase by a larger amount than expected (filled with spaces).
- if the cursor is moved to the left, it is perfectly possible that the resulting length of the screen will be smaller than its previous size. Writing data to a buffer made it shrink!!

That's totally unexpected. pexpect's Expecter should be reimplemented, if such a thing is possible.
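The three cases can be checked with a small one-row model. This is a sketch under assumptions: the "length of the screen" is taken to be the text from column 0 to the cursor, only cursor forward (CSI n C) and cursor back (CSI n D) are interpreted, and every other CSI sequence is ignored.

```python
import re

# Either a CSI sequence (parameters + final byte) or a single character.
TOKEN = re.compile(r"\x1b\[([0-9;]*)([@-~])|(.)", re.DOTALL)


def feed(line, x, data):
    """Apply data to a one-row toy screen and return (line, x)."""
    cells = list(line)
    for count, final, char in TOKEN.findall(data):
        if final == "C":              # cursor forward
            x += int(count or 1)
        elif final == "D":            # cursor back
            x = max(0, x - int(count or 1))
        elif final:                   # other CSI sequences write nothing
            pass
        else:                         # printable character
            cells.extend(" " * (x + 1 - len(cells)))
            cells[x] = char
            x += 1
    return "".join(cells), x


def visible(line, x):
    """Text 'in the buffer': from column 0 to the cursor."""
    return line.ljust(x)[:x]


def growth(data, line="abc", x=3):
    """How much the visible text grows when data is written."""
    before = len(visible(line, x))
    line, x = feed(line, x, data)
    return len(visible(line, x)) - before


for data in ("\x1b[1m", "\x1b[5Cd", "\x1b[2D"):
    print(repr(data), "len:", len(data), "growth:", growth(data))
# '\x1b[1m'  len: 4 growth: 0    (writes nothing)
# '\x1b[5Cd' len: 5 growth: 6    (five blanks plus the 'd')
# '\x1b[2D'  len: 4 growth: -2   (the buffer shrinks)
```

In none of the three cases does the growth equal `len(data)`, so any bookkeeping in Expecter that relies on that equality would be off.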

Additional context (optional)

This feature may not be worth it, as it only solves a problem for which the current workaround works reasonably well.

However there are cases that fail.

Suppose the following:

>>> import questionary
>>> questionary.text("What's your first name").ask()    # byexample: +type +term=ansi
<...> first name [John]
'John'

1. Before the [John] input, the prefix found is "first name " (notice the space at the end). That space at the end, however, does not show up in the output, so the prefix is never matched.

The problem is that the interpreter does not write a space but writes an escape sequence to move the cursor one place to the right.

2. The question written and its answer are echoed back. My understanding is that questionary clears the screen and overwrites the line, so visually there is no problem, but it affects the +type thing.

This is what we've got:

? What's your first name[John]
? What's your first name John
'John'

This is because, with +type, byexample emulates the echo of the response and that adds a \r\n. When the real output from the example then clears the current line, it clears the wrong one, so we end up with two duplicated lines.

Removing the emulated echo almost fixes the situation. Here is what we've got after:

What's your first name John
'John'

3. Once 2 is fixed, the example still fails because the expected has [John] while the output has John. This makes total sense because the [ ] are put manually by byexample in the emulated echo. Perhaps the correct fix here would be to remove the [ ] from the expected regex and from the emulated echo.

Perhaps the expected regex should never have the typed text, so byexample should not even try to echo it... but that breaks if the example itself is the one that does the echo.

(1) is totally within the scope of this issue; (2) and (3) are outside it, but they describe a real use case where the echoing and the escape sequences affect byexample.
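The never-matched "first name " prefix can be reproduced in isolation. This is a sketch: the raw string is a guess at what the interpreter sends (I did not capture the exact bytes), and rendering "cursor forward" as spaces is only valid when the skipped cells are blank.

```python
import re

# Hypothetical raw output: instead of a space, the interpreter emits
# "cursor forward one column" (CSI 1 C) after the question.
raw = "? What's your first name\x1b[1C"

# The prefix searched before the typed text, as a literal regex.
PREFIX_RE = re.compile(re.escape("first name "))

CURSOR_FORWARD_RE = re.compile(r"\x1b\[(\d*)C")


def expand_cursor_forward(data):
    """Render CSI n C as n spaces (assumes the skipped cells are blank)."""
    return CURSOR_FORWARD_RE.sub(lambda m: " " * int(m.group(1) or 1), data)


raw_match = PREFIX_RE.search(raw)
emulated_match = PREFIX_RE.search(expand_cursor_forward(raw))

print(raw_match)                   # None: there is no real space to match
print(emulated_match is not None)  # True once the movement is rendered
```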

Sep 14 '22 13:09 eldipa

Related to #180. Since 11.0.0 we can emulate a very small set of sequences before pexpect reads and processes the output, but most of the sequences are filtered out (ignored). This works nicely for +term=dumb but it is not the full solution.

Oct 28 '22 01:10 eldipa