String parsing regression with v0.10.0.2

Open twinkler-ams-osram opened this issue 3 years ago • 5 comments

Hi,

I'm seeing the following regression with release 0.10.0.2 (the previous one worked fine). I'm issuing a GDB MI command as follows (setting a field of a struct):

-data-evaluate-expression "((my_add_t*)0x200002f8)->paddA = 254"

The response coming back from gdb is like this:

1030^done,value="254 '\376'"

The function _unescape_internal in gdbescapes.py raises an exception when executing this line:

replaced = octal_sequence_bytes.decode("utf-8")

Exception:

'utf-8' codec can't decode byte 0xfe in position 0: invalid start byte
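The decode failure can be seen in isolation, without pygdbmi: the octal escape \376 denotes the single byte 0xFE, which can never start a UTF-8 sequence. A minimal sketch (the helper name try_decode is made up for illustration and is not part of pygdbmi):

```python
# Octal 376 is the single byte 0xFE (decimal 254).
octal_sequence_bytes = bytes([0o376])


def try_decode(data: bytes) -> str:
    """Hypothetical helper: decode as UTF-8 and report the error instead of raising."""
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError as exc:
        return f"error: {exc}"


result = try_decode(octal_sequence_bytes)
print(result)
```

Bytes that form valid UTF-8, such as b"A", decode normally through the same helper.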

You can directly reproduce/trigger this issue with:

from pygdbmi.gdbescapes import _unescape_internal
_unescape_internal("1030^done,value=\"254 '\376'\"", expect_closing_quote=True)

Is this change in behavior intended? If yes, what would I need to do to re-enable the old behavior?

Thank you.

Environment:

OS: Windows 10
pygdbmi version: 0.10.0.2

twinkler-ams-osram avatar May 10 '22 14:05 twinkler-ams-osram

Previously, pygdbmi would either ignore or mangle escapes (depending on where they appeared) but this was fixed in 0.10.0.2. See https://github.com/cs01/pygdbmi/issues/57 and https://github.com/cs01/pygdbmi/issues/58.

The consequence is that now pygdbmi expects the output from GDB to be well formed.

Long story short, this seems like a bug in GdbController, not the unescaping code. I will investigate this but, in the meantime, please use 0.10.0.1.

Details

If I do this in my GDB:

-data-evaluate-expression "s[1] = 254"

I get:

=memory-changed,thread-group="i1",addr="0x00005555555592a1",len="0x1"
^done,value="-2 '\\376'"

Note the double backslash before 376.

And with GdbController (I'm using a trivial program where s is a char *):

gdbmi = GdbController()
gdbmi.write("-file-exec-and-symbols [...]")
gdbmi.write("break [...]")
gdbmi.write("run")
print(gdbmi.write('''-data-evaluate-expression "s[1] = 254"''')[1]["payload"]["value"])

The output is -2 '\376'

Where did the other \ go?!

barisione avatar May 10 '22 15:05 barisione

Thanks for having a look so quickly!

For the moment I've pinned my stuff to 0.10.0.1.

twinkler-ams-osram avatar May 11 '22 09:05 twinkler-ams-osram

In my previous comment I got confused: the behaviour with my GDB is actually correct. I thought I was printing the repr of the string, but I had added a print(), so what I printed was the str version of the value in the response's payload.
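The str/repr difference is easy to check on its own. A small sketch, assuming the parsed value holds a single literal backslash (the variable name value is only for illustration):

```python
# The parsed value as a Python string: exactly one backslash character.
value = "-2 '\\376'"

print(value)        # str form: shows the single backslash as-is
print(repr(value))  # repr form: the backslash is shown escaped (doubled)

backslashes_in_str = str(value).count("\\")
backslashes_in_repr = repr(value).count("\\")
```

So a single backslash in the print() output is what a correctly unescaped value looks like; the doubled backslash only shows up in the repr, just as it does in GDB's raw MI output.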

From what I can see here everything works correctly.

I'm using this trivial program (in a file called strings.c):

#include <stdio.h>

static void print_string(char *the_string)
{
    printf("%s\n", the_string);
}

int main()
{
    char the_string[] = "Hello world!";
    print_string(the_string);

    return 0;
}

Compiled with:

gcc strings.c -g -o strings_app

Then, in an interactive Python session, I'm doing:

from pygdbmi.gdbcontroller import GdbController
gdbmi = GdbController()
gdbmi.write(
    [
        "-file-exec-and-symbols strings_app",
        "-break-insert print_string",
        "-exec-run",
    ]
)

Then I try -data-evaluate-expression:

print(gdbmi.write(['-data-evaluate-expression "the_string[0] = 254"'])[-1]["payload"]["value"])  # → -2 '\376'
print(gdbmi.write(['-data-evaluate-expression "the_string[0] = 32"'])[-1]["payload"]["value"])  # → 32 ' '
print(gdbmi.write(['-data-evaluate-expression "the_string[0] = 10"'])[-1]["payload"]["value"])  # → 10 '\n'
print(gdbmi.write(['-data-evaluate-expression "the_string[0] = 65"'])[-1]["payload"]["value"])  # → 65 'A'

From what I can see everything is correct.

I'm using GDB 9.2 on Ubuntu 20.04.

Do you have any more details that could help reproduce this?

barisione avatar May 13 '22 10:05 barisione

Hi,

I checked my CI/CD environment. Indeed, it seems that only my Windows 10 job was failing with release 0.10.0.2, with the error I reported. The Linux (Ubuntu 20.04) job did not fail. I'll dig deeper in the coming days and report back.

On both Windows and Linux, I'm using gdb 8.3.1 from the GNU Arm Embedded Toolchain version 9-2020-q2 from https://developer.arm.com/downloads/-/gnu-rm. The target is a Cortex-M0+ MCU system.

twinkler-ams-osram avatar May 18 '22 07:05 twinkler-ams-osram

I think this may be a GDB bug on Windows, potentially affecting only a few versions of GDB.

I will try to work around it.
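One possible shape for such a workaround, purely as an illustration and not the actual pygdbmi change: decode runs of octal escapes as UTF-8 when they form valid UTF-8, and leave them untouched otherwise. The function name decode_octal_runs is hypothetical, and this sketch does not handle escaped backslashes or other escape kinds.

```python
import re

# One or more consecutive octal escapes, each in the byte range \000 to \377.
_OCTAL_RUN = re.compile(r"(?:\\[0-3][0-7]{2})+")


def decode_octal_runs(text: str) -> str:
    """Hypothetical sketch: replace valid UTF-8 octal runs, keep invalid ones as-is."""

    def _replace(match: "re.Match[str]") -> str:
        codes = re.findall(r"[0-7]{3}", match.group(0))
        raw = bytes(int(code, 8) for code in codes)
        try:
            return raw.decode("utf-8")
        except UnicodeDecodeError:
            # Not valid UTF-8 (e.g. a lone 0xFE byte): keep the escape text.
            return match.group(0)

    return _OCTAL_RUN.sub(_replace, text)


print(decode_octal_runs(r"254 '\376'"))  # invalid UTF-8, escape kept
print(decode_octal_runs(r"'\303\251'"))  # valid UTF-8 for U+00E9
```

Keeping the original escape text is only one option; replacing undecodable bytes with a placeholder character would be another, with different trade-offs for callers.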

barisione avatar May 30 '22 13:05 barisione

Thank you for your patience! I'm working on a few fixes/improvements and will then release this fix.

barisione avatar Aug 09 '22 07:08 barisione