pygdbmi
String parsing regression with v0.10.0.2
Hi,
I'm seeing the following regression with release 0.10.0.2 (the previous one worked fine). I'm issuing a GDB MI command as follows (setting a field of a struct):
-data-evaluate-expression "((my_add_t*)0x200002f8)->paddA = 254"
The response coming back from gdb is like this:
1030^done,value="254 '\376'"
The function _unescape_internal in gdbescapes.py raises an exception when executing this line:
replaced = octal_sequence_bytes.decode("utf-8")
Exception:
'utf-8' codec can't decode byte 0xfe in position 0: invalid start byte
You can directly reproduce/trigger this issue with:
from pygdbmi.gdbescapes import _unescape_internal
_unescape_internal("1030^done,value=\"254 '\376'\"", expect_closing_quote=True)
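The decoding failure itself can be shown with the standard library alone. This is a minimal sketch, assuming the octal escape \376 has already been converted to the single byte 0xFE before the decode call; only the name octal_sequence_bytes comes from the line quoted above, the rest is illustrative.

```python
# Octal escape \376 corresponds to the single byte 0xFE (decimal 254).
octal_sequence_bytes = bytes([0o376])

try:
    octal_sequence_bytes.decode("utf-8")
    error_message = None
except UnicodeDecodeError as exc:
    # 0xFE can never start a UTF-8 sequence, so strict decoding fails.
    error_message = str(exc)

print(error_message)

# Latin-1 maps every byte to a code point, so this decode cannot fail.
latin1_text = octal_sequence_bytes.decode("latin-1")
print(repr(latin1_text))
```

The byte is a valid Latin-1 character but not valid UTF-8, which is consistent with the error message above.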
Is this change in behavior intended? If so, what do I need to do to re-enable the old behavior?
Thank you.
Environment:
OS: Windows 10
pygdbmi version: 0.10.0.2
Previously, pygdbmi would either ignore or mangle escapes (depending on where they appeared) but this was fixed in 0.10.0.2. See https://github.com/cs01/pygdbmi/issues/57 and https://github.com/cs01/pygdbmi/issues/58.
The consequence is that now pygdbmi expects the output from GDB to be well formed.
Long story short, this seems like a bug in GdbController, not the unescaping code.
I will investigate this but, in the meantime, please use 0.10.0.1.
Details
If I do this in my GDB:
-data-evaluate-expression "s[1] = 254"
I get:
=memory-changed,thread-group="i1",addr="0x00005555555592a1",len="0x1"
^done,value="-2 '\\376'"
Note the double backslash before 376.
And with GdbController (I'm using a trivial program where s is a char *):
gdbmi = GdbController()
gdbmi.write("-file-exec-and-symbols [...]")
gdbmi.write("break [...]")
gdbmi.write("run")
print(gdbmi.write('''-data-evaluate-expression "s[1] = 254"''')[1]["payload"]["value"])
The output is -2 '\376'
Where did the other \ go?!
Thanks for having a look that quickly!
For the moment I've pinned my stuff to 0.10.0.1.
I got confused in my previous comment: the behaviour with my GDB is actually correct. I thought I was looking at the repr of the string, but because I had wrapped the call in print(), I was seeing the str version of the response's payload value.
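The str/repr difference can be illustrated without GDB. This is a minimal sketch, assuming the payload value is the Python string shown below (a single backslash followed by 376); the variable names are illustrative.

```python
# The payload value as a Python string: one backslash followed by "376".
value = "-2 '\\376'"

str_form = str(value)    # what print(value) displays
repr_form = repr(value)  # what the interactive prompt displays

print(str_form)
print(repr_form)

# The string holds a single backslash; repr() doubles it for display.
print(str_form.count("\\"), repr_form.count("\\"))
```

The string contains one backslash in both cases; only the repr form shows it doubled.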
From what I can see here everything works correctly.
I'm using this trivial program (in a file called strings.c):
#include <stdio.h>
static void print_string(char *the_string)
{
    printf("%s\n", the_string);
}

int main()
{
    char the_string[] = "Hello world!";
    print_string(the_string);
    return 0;
}
Compiled with:
gcc strings.c -g -o strings_app
Then, in an interactive Python session, I'm doing:
from pygdbmi.gdbcontroller import GdbController
gdbmi = GdbController()
gdbmi.write(
    [
        "-file-exec-and-symbols strings_app",
        "-break-insert print_string",
        "-exec-run",
    ]
)
Then I try -data-evaluate-expression:
print(gdbmi.write(['-data-evaluate-expression "the_string[0] = 254"'])[-1]["payload"]["value"]) # → -2 '\376'
print(gdbmi.write(['-data-evaluate-expression "the_string[0] = 32"'])[-1]["payload"]["value"]) # → 32 ' '
print(gdbmi.write(['-data-evaluate-expression "the_string[0] = 10"'])[-1]["payload"]["value"]) # → 10 '\n'
print(gdbmi.write(['-data-evaluate-expression "the_string[0] = 65"'])[-1]["payload"]["value"]) # → 65 'A'
From what I can see everything is correct.
I'm using GDB 9.2 on Ubuntu 20.04.
Do you have any more details that could help reproduce this?
Hi,
I checked my CI/CD environment. Indeed, only my Windows 10 job was failing with release 0.10.0.2, with the error I reported. The Linux (Ubuntu 20.04) job did not fail. I'll dig deeper in the coming days and report back.
On both Windows and Linux, I'm using gdb 8.3.1 from the GNU Arm Embedded Toolchain version 9-2020-q2 from https://developer.arm.com/downloads/-/gnu-rm. The target is a Cortex-M0+ MCU system.
I think this may be a GDB bug on Windows, potentially affecting only a few versions of GDB.
I will try to work around it.
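For illustration only, here is one possible shape for a tolerant decoder. It is a sketch, not the change that went into pygdbmi: the helper name unescape_octal_lenient and the policy of leaving undecodable escapes untouched are assumptions made for this example.

```python
import re

# Either an escaped backslash (left alone) or a run of octal escapes
# such as \303\251 or \376. The first digit is limited to 0-3 so every
# escape fits in one byte.
_ESCAPES = re.compile(r"\\\\|(?:\\[0-3][0-7]{2})+")


def unescape_octal_lenient(text):
    """Hypothetical helper: decode octal escape runs that are valid UTF-8."""

    def _decode(match):
        run = match.group(0)
        if run == "\\\\":
            # An escaped backslash is not an octal escape; keep it as is.
            return run
        # Each escape is a backslash plus three octal digits (4 characters).
        raw = bytes(int(run[i + 1 : i + 4], 8) for i in range(0, len(run), 4))
        try:
            return raw.decode("utf-8")
        except UnicodeDecodeError:
            # Not valid UTF-8 (for example 0xFE): keep the original escapes.
            return run

    return _ESCAPES.sub(_decode, text)


print(unescape_octal_lenient(r"caf\303\251"))
print(unescape_octal_lenient(r"254 '\376'"))
```

With this policy, a value such as 254 '\376' passes through unchanged instead of raising, while well-formed UTF-8 sequences are still decoded. Whether to keep the escapes, substitute a replacement character, or fall back to another encoding is a design choice this sketch does not settle.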
Thank you for your patience! I'm working on a few fixes/improvements and will then release this fix.