gdbgui
ignore source file decoding errors
Hello,
I'm debugging my C application and everything is working so far, as long as I'm debugging small source files. But with larger ones (e.g. 84428 bytes), your source window throws an error saying it could not find the file. When I use the list command in the gdb console, it can show me the source code. And I have no problem with smaller files in the same location.
Thanks ahead!
It's not a problem with the file size. The files contain some special characters (German ones), and Python throws this error, for example:
'utf-8' codec can't decode byte 0xe4 in position 15179: invalid continuation byte
When I remove it from the file, it can be loaded. Stack Overflow is full of such cases; perhaps you can work around it in the future.
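A minimal sketch of how this error arises, assuming the file was saved in a single-byte encoding such as Latin-1 (the sample word and the helper name `try_decode_utf8` are made up for illustration). In Latin-1 the letter "ä" is the single byte 0xe4, which UTF-8 treats as the start of a multi-byte sequence:

```python
def try_decode_utf8(raw):
    """Return (text, None) if raw is valid UTF-8, else (None, error)."""
    try:
        return raw.decode("utf-8"), None
    except UnicodeDecodeError as err:
        return None, err


# The same word stored in two different encodings.
latin1_bytes = "Zähler".encode("latin-1")  # b'Z\xe4hler'
utf8_bytes = "Zähler".encode("utf-8")      # b'Z\xc3\xa4hler'

# 0xe4 announces a multi-byte sequence, but the next byte ('h') is
# not a continuation byte, so strict UTF-8 decoding fails.
text, err = try_decode_utf8(latin1_bytes)
print(err)

# Properly encoded UTF-8 decodes without an error.
text_ok, err_ok = try_decode_utf8(utf8_bytes)
print(text_ok)
```

The exception carries the offending position in `err.start`, which can help locate the character in the source file.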
The server uses the Python function open. It uses whatever encoding is defined for the system, and if it can't decode something, it raises an error. https://docs.python.org/3/library/functions.html#open
It sounds like your system encoding
> python -c "import locale; print(locale.getpreferredencoding())"
doesn't match that file.
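One way to check for such a mismatch is to try decoding the file's raw bytes with a given encoding. The following is only a sketch; the helper name `decodes_with` and the demo file are hypothetical and not part of gdbgui:

```python
import locale
import os
import tempfile


def decodes_with(path, encoding):
    """Return True if the whole file decodes cleanly with `encoding`."""
    with open(path, "rb") as f:
        raw = f.read()
    try:
        raw.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


# The encoding that open() falls back to when none is passed.
system_encoding = locale.getpreferredencoding()
print("system encoding:", system_encoding)

# Demo file containing a Latin-1 encoded umlaut (byte 0xe4).
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.c")
    with open(path, "wb") as f:
        f.write(b"/* Z\xe4hler */\n")
    ok_utf8 = decodes_with(path, "utf-8")
    ok_latin1 = decodes_with(path, "latin-1")

print("utf-8:", ok_utf8, "latin-1:", ok_latin1)
```

Calling `decodes_with(path, system_encoding)` on the real source file would show whether the system default can read it.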
In any case, a good fix for gdbgui will be to not raise errors when encoding issues occur.
Thanks for your reply. The file is encoded in UTF-8 and the system encoding is also UTF-8.
But if gdbgui would "ignore" such errors and at least display the source, that would help a lot.
Thanks for your reply. The file is encoded in UTF-8 and the system encoding is also UTF-8.
I wonder why the error is occurring then. Is the character invalid?
I'm planning to change
with open(path, "r") as f:
to
with open(path, "r", errors="replace") as f:
Do you think that will fix it?
Yes, sorry. I converted an ANSI file with special characters to UTF-8 using Notepad++, and the error was thrown again. When I remove all the "invalid" characters by hand, there is no problem. After that I can add new special characters without a problem.
I've added 'errors="replace"' to your open call on line 688 in backend.py and it works without a problem.
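For illustration, here is a small sketch of what errors="replace" does with an undecodable byte. The function name `read_source_lossy` and the demo file are hypothetical, and the encoding is passed explicitly here only to make the example independent of the system locale (the change discussed above relies on the default):

```python
import os
import tempfile


def read_source_lossy(path):
    """Read a text file, replacing undecodable bytes with U+FFFD."""
    with open(path, "r", encoding="utf-8", errors="replace") as f:
        return f.read()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.c")
    # Write a C file whose comment holds a Latin-1 encoded umlaut.
    with open(path, "wb") as f:
        f.write("// Zähler\nint main(void) { return 0; }\n".encode("latin-1"))

    # The strict default raises on the stray byte.
    try:
        with open(path, "r", encoding="utf-8") as f:
            f.read()
        strict_failed = False
    except UnicodeDecodeError:
        strict_failed = True

    # With errors="replace" the file loads; the bad byte becomes U+FFFD.
    source = read_source_lossy(path)

print(strict_failed)
print(source)
```

The trade-off is that the affected characters are shown as replacement characters rather than the original letters, but the rest of the source remains readable.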
Thank you very much! And gdbgui is a fantastic tool.
Thank you for the feedback!
I found a workaround for this issue. gdbgui can only read UTF-8 files. A file encoded as GBK can't be loaded.
The patch is tested on CentOS 7 with gdbgui 0.14.0.2.
In /usr/local/lib/python3.6/site-packages/gdbgui/server/http_routes.py, change:
with open(path, "r") as f:
to:
# Read the raw bytes and try each candidate encoding in turn.
f = open(path, 'rb')
content = f.read()
source_encoding = 'utf-8'
try:
    content.decode('utf-8').encode('utf-8')
    source_encoding = 'utf-8'
except:
    try:
        content.decode('gbk').encode('utf-8')
        source_encoding = 'gbk'
    except:
        try:
            content.decode('gb2312').encode('utf-8')
            source_encoding = 'gb2312'
        except:
            try:
                content.decode('gb18030').encode('utf-8')
                source_encoding = 'gb18030'
            except:
                try:
                    content.decode('big5').encode('utf-8')
                    source_encoding = 'big5'
                except:
                    # Last resort; raises if this fails too.
                    content.decode('cp936').encode('utf-8')
                    source_encoding = 'cp936'
f.close()
print("Codec of file is %s" % source_encoding)
# Reopen the file as text with the detected encoding.
with codecs.open(path, "r", source_encoding) as f:
And add import codecs at the top of http_routes.py.
@cs01
http_routes.py.txt
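The same idea can be written more compactly as a loop over candidate encodings. This is only a sketch and not the code in gdbgui; the names `CANDIDATE_ENCODINGS`, `detect_encoding`, and `decode_source` are hypothetical, and the final fallback to errors="replace" is an addition borrowed from the earlier suggestion in this thread:

```python
# Candidates in the same order as the workaround above.
CANDIDATE_ENCODINGS = ("utf-8", "gbk", "gb2312", "gb18030", "big5", "cp936")


def detect_encoding(raw, candidates=CANDIDATE_ENCODINGS):
    """Return the first candidate that decodes `raw` cleanly, else None."""
    for encoding in candidates:
        try:
            raw.decode(encoding)
        except UnicodeDecodeError:
            continue
        return encoding
    return None


def decode_source(raw, candidates=CANDIDATE_ENCODINGS):
    """Decode `raw` with the first matching candidate.

    If no candidate matches, fall back to UTF-8 with replacement
    characters so that the source can still be displayed.
    """
    encoding = detect_encoding(raw, candidates)
    if encoding is None:
        return raw.decode("utf-8", errors="replace"), None
    return raw.decode(encoding), encoding


gbk_bytes = "中文".encode("gbk")
utf8_bytes = "中文".encode("utf-8")

print(detect_encoding(gbk_bytes))
print(detect_encoding(utf8_bytes))
print(decode_source(gbk_bytes))
```

The order of the candidates matters: the first encoding that accepts the bytes wins, even if a later one would give the intended text, so this kind of detection is a heuristic rather than a guarantee.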
So we do have a change, with feedback "works" which was not applied yet. I guess after the refactoring this should now go to https://github.com/cs01/gdbgui/blob/531d89890c0b4bd3bbf15d266b9ec25a2c7eebaa/gdbgui/server/http_routes.py#L55
Friendly ping.
Note: I see no issues with gdbgui and ISO-8859-15 encoded source files with German umlauts, but locale also shows that this is the configured language setup, and according to the Python docs for open(), the preferred encoding is the default that is used. So @Hyphen90 and @Fenglingang, you may want to adjust the locale settings before opening gdbgui.
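To illustrate why a matching locale helps: open() only falls back to the preferred encoding when no encoding argument is given. The sketch below passes the encoding explicitly, which has the same effect as a matching locale for that one call. The sample text and file name are made up, and this is not how gdbgui itself opens files:

```python
import os
import tempfile

original = "// Größe in €\n"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.c")

    # Write the file as ISO-8859-15; newline="" disables newline translation.
    with open(path, "w", encoding="iso-8859-15", newline="") as f:
        f.write(original)

    # Reading with the same encoding restores the text exactly.
    with open(path, "r", encoding="iso-8859-15", newline="") as f:
        text = f.read()

    # The bytes on disk are single-byte codes, not UTF-8 sequences.
    with open(path, "rb") as f:
        raw = f.read()

print(text)
print(raw)
```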
Hi unfortunately I do not have the bandwidth to address this issue. This is a hobby project and between my full time job and new child I just don’t have time to work on this for the time being. Apologies if this inconveniences you.
I mostly wondered about the state, if someone knows if that new file would be the correct place I would give a PR a try... maybe I just try it in any case.
The encoding-detection workaround for http_routes.py quoted above works for me!
gdbgui 0.15.2.0 has been released, which should fix this issue.