cle Fails to load PE binaries with non-utf-8 decodable bytes in section name

Open AlexVanMechelen opened this issue 2 years ago • 8 comments

Description

Loading a PE binary that has non-utf-8 decodable bytes in the name of one of its sections causes a crash here:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd2 in position 6: invalid continuation byte

We could pass errors='ignore' to .decode() to drop the bytes that cannot be decoded as utf-8.
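A minimal sketch of the failure and of the errors='ignore' behaviour, using plain Python and no cle code. The 8-byte section name below is invented for illustration; it only mirrors the reported error by placing byte 0xd2 at position 6.

```python
# Hypothetical 8-byte PE section name (invented for illustration):
# byte 0xd2 sits at position 6 and is not followed by a continuation byte.
raw_name = b".packe\xd2Z"

# Strict decoding raises, as in the report.
try:
    raw_name.decode("utf-8")
    error_message = None
except UnicodeDecodeError as exc:
    error_message = str(exc)

# errors="ignore" silently drops the undecodable byte.
ignored = raw_name.decode("utf-8", errors="ignore")

print(error_message)
print(ignored)
```

Note that with errors='ignore' the decoded name is shorter than the raw one, so two different raw names can end up as the same string.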

Nov 08 '23 19:11 AlexVanMechelen

@rhelmot I've been thinking about this for a while. Should we use bytes instead of str for section and segment names?

Nov 08 '23 19:11 ltfish

My question for OP is: are your section names encoded in some other encoding, or are they garbage?

Nov 08 '23 20:11 rhelmot

@rhelmot The section names are garbage. Some executable packers create such garbage section names, leading to the above error for all executables packed with them.

Nov 08 '23 21:11 AlexVanMechelen

Does any compiler support generating utf-8 section names? If so, I would recommend decoding as utf-8 with errors='replace'. If not, Latin-1.
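A rough comparison of the two options, as a sketch only. Both byte strings are made-up examples: the first is a garbage name, the second is a name that happens to be valid utf-8.

```python
# Both names are invented for illustration.
garbage_name = b".packe\xd2Z"      # not valid utf-8
utf8_name = b".s\xc3\xa9c"         # valid utf-8 for ".séc"

# Option 1: utf-8 with errors="replace" puts U+FFFD where decoding fails.
garbage_replaced = garbage_name.decode("utf-8", errors="replace")
utf8_replaced = utf8_name.decode("utf-8", errors="replace")

# Option 2: Latin-1 maps every byte to exactly one code point.
garbage_latin1 = garbage_name.decode("latin-1")
utf8_latin1 = utf8_name.decode("latin-1")

# Latin-1 is lossless: the original bytes can always be recovered.
round_trip = garbage_latin1.encode("latin-1")

print(repr(garbage_replaced), repr(utf8_replaced))
print(repr(garbage_latin1), repr(utf8_latin1))
```

The trade-off: errors='replace' keeps genuine utf-8 names readable but loses the original garbage bytes, while Latin-1 preserves every byte but renders a genuine utf-8 name as mojibake.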

Nov 08 '23 23:11 rhelmot

@rhelmot Some malware intentionally uses garbage section names. I don't think we want to fail to load those binaries in such cases.

Nov 09 '23 00:11 ltfish

Neither of those solutions will fail with garbage bytes.
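A quick way to check this claim in plain Python, as a sketch: decode every possible two-byte sequence with both approaches and count the exceptions.

```python
import itertools

failures = 0
checked = 0

# Try every possible pair of byte values with both decoding strategies.
for pair in itertools.product(range(256), repeat=2):
    raw = bytes(pair)
    try:
        raw.decode("utf-8", errors="replace")
        raw.decode("latin-1")
    except UnicodeDecodeError:
        failures += 1
    checked += 1

print(checked, failures)
```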

Nov 09 '23 00:11 rhelmot

Why don't we default to latin-1?

Nov 09 '23 05:11 ltfish

That's why I asked whether compilers let you generate utf-8 section names manually.

Nov 09 '23 05:11 rhelmot
