Many errors are reported when parsing the dbc with Chinese characters and special characters

Open Liluoquan opened this issue 2 years ago • 4 comments

When I use canmatrix to load a DBC file whose signal names contain Chinese characters and special characters, for example:

_matrix = canmatrix.formats.dbc.load(f, dbcImportEncoding=encoding)

errors like this are reported:

error with line no: 2004
b' SG_ PSDCU_RR\xe4\xb8\xbb\xe8\xbd\xaf\xe4\xbb\xb6\xe7\x89\x88\xe6\x9c\xac\xe5\x8f\xb7$_W : 63|8@0+(1,0)[0|255] "" Vector__XXX\r\n'

The original line looks like this:

SG_ 冗余制动降级状态$_W : 23|3@0+(1,0)[0|7] "" Vector__XXX

I then found that canmatrix uses a regex to match each line in the DBC. For lines starting with 'SG_' it uses the following pattern:

pattern = r"^SG_ +(\w+) *: *(\d+)\|(\d+)@(\d+)([\+|\-]) *\(([0-9.+\-eE]+), *([0-9.+\-eE]+)\) *\[([0-9.+\-eE]+)\|([0-9.+\-eE]+)\] +\"(.*)\" +(.*)"

The group (\w+) cannot match signal names like these, which contain Chinese characters and special characters such as $, in Python 3.8, so I suggest changing the regex above to:

pattern = r"^SG_ +(\S+) *: *(\d+)\|(\d+)@(\d+)([\+|\-]) *\(([0-9.+\-eE]+), *([0-9.+\-eE]+)\) *\[([0-9.+\-eE]+)\|([0-9.+\-eE]+)\] +\"(.*)\" +(.*)"

This would handle the scenarios mentioned in this issue. Please reply, it's very important to me!

Liluoquan, Nov 24 '23 02:11
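The matching behaviour described in the report can be reproduced with Python's re module alone. This is a minimal sketch, not canmatrix's actual parsing code: the two name groups and the rest of the pattern are taken from the report, while the helper signal_name and the second sample line are hypothetical. One detail worth noting: for str patterns in Python 3, \w does match Chinese characters, so in this sample it is the $ in the name that stops (\w+).

```python
import re

# Part of the pattern after the name group, taken from the report.
TAIL = (r' *: *(\d+)\|(\d+)@(\d+)([\+|\-]) *\(([0-9.+\-eE]+), *([0-9.+\-eE]+)\)'
        r' *\[([0-9.+\-eE]+)\|([0-9.+\-eE]+)\] +"(.*)" +(.*)')

ORIGINAL = re.compile(r'^SG_ +(\w+)' + TAIL)  # name group as reported
PROPOSED = re.compile(r'^SG_ +(\S+)' + TAIL)  # name group as proposed


def signal_name(pattern, line):
    """Hypothetical helper: return the captured signal name, or None on no match."""
    m = pattern.match(line.strip())
    return m.group(1) if m else None


# Sample line from the report, plus a variant without the "$_W" suffix.
with_dollar = ' SG_ 冗余制动降级状态$_W : 23|3@0+(1,0)[0|7] "" Vector__XXX\r\n'
plain_cjk = ' SG_ 冗余制动降级状态 : 23|3@0+(1,0)[0|7] "" Vector__XXX\r\n'

print(signal_name(ORIGINAL, plain_cjk))    # Chinese-only name
print(signal_name(ORIGINAL, with_dollar))  # name containing "$"
print(signal_name(PROPOSED, with_dollar))  # same line, proposed pattern
```

Since (\S+) accepts any run of non-whitespace characters, it is more permissive than (\w+); whether that is acceptable for every DBC file is a design decision for the parser.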

Hi @Liluoquan

You have to specify the encoding with dbcImportEncoding.

Maybe something like dbcImportEncoding="utf8"

ebroecker, Nov 27 '23 16:11

Hi @Liluoquan

any success?

ebroecker, Dec 04 '23 13:12

Hi @ebroecker, sorry, it didn't work when I used utf-8, GB2312, or gbk:

_matrix = canmatrix.formats.dbc.load(f, dbcImportEncoding='utf-8')

The error is as follows:

error with line no: 28
b' SG_ \xca\xfd\xd7\xd6\xd6\xa4\xca\xe9\xb4\xe6\xb4\xa2\xb9\xca\xd5\xcf$_W : 20|1@0+(1,0)[0|1] "" Vector__XXX\r\n'

Liluoquan, Dec 07 '23 01:12
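Separately from the regex, it may be worth checking which encoding each file actually uses, because the two byte dumps in this thread do not look alike. The sketch below relies only on the built-in bytes.decode; the helper candidate_encodings and the list of encodings it tries are assumptions for illustration, and the byte strings are the signal-name bytes copied from the two error messages.

```python
# Signal-name bytes copied from the two error messages in this thread.
NAME_LINE_2004 = (b'PSDCU_RR\xe4\xb8\xbb\xe8\xbd\xaf\xe4\xbb\xb6'
                  b'\xe7\x89\x88\xe6\x9c\xac\xe5\x8f\xb7$_W')
NAME_LINE_28 = (b'\xca\xfd\xd7\xd6\xd6\xa4\xca\xe9'
                b'\xb4\xe6\xb4\xa2\xb9\xca\xd5\xcf$_W')


def candidate_encodings(raw, encodings=('utf-8', 'gbk')):
    """Hypothetical helper: list the encodings that decode `raw` without error."""
    usable = []
    for enc in encodings:
        try:
            raw.decode(enc)
        except UnicodeDecodeError:
            continue  # this encoding cannot represent the byte sequence
        usable.append(enc)
    return usable


print(candidate_encodings(NAME_LINE_2004))
print(candidate_encodings(NAME_LINE_28))
```

A clean decode only shows that an encoding is possible, not that it is the right one. In either case the decoded name still ends in $_W, so the (\w+) group would reject it even with a correct dbcImportEncoding.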

Hi @Liluoquan

I did not read your issue completely the first time - sorry.

You already provided a potential fix. Thanks for it! I'll add your provided fix soon.

ebroecker, Dec 12 '23 12:12