
Should non-ASCII contents in .asciz blocks be rejected?

Open Fuuzetsu opened this issue 1 year ago • 4 comments

I've made an issue at https://github.com/Decompollaborate/spimdisasm/issues/121 which shows that there are sometimes .asciz blocks emitted with non-ASCII characters in them.

m2c then happily takes them and outputs bytes from UTF-8 encoding (as far as I can tell).

That is, given .asciz "奩" it will happily spit out e5 a5 a9 00. A brief grep brings me to parse_ascii_directive, which has a comment at the very top about being wrong w.r.t. encodings: I guess this is exactly the issue it's talking about? A few lines later I see a very explicit c.encode("utf-8"). Is this what MIPS assemblers would usually do, or would they just interpret the whole input as ASCII to start with? I don't know.
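To make the two options concrete, here is a minimal standalone sketch. It is not m2c's actual code: `encode_asciz_utf8` and `encode_asciz_strict` are hypothetical names I made up for illustration. The first mimics the per-character `c.encode("utf-8")` behaviour described above, the second shows what rejecting non-ASCII input could look like.

```python
def encode_asciz_utf8(text: str) -> bytes:
    """Hypothetical helper: encode each character as UTF-8, then append the NUL terminator."""
    return b"".join(c.encode("utf-8") for c in text) + b"\x00"


def encode_asciz_strict(text: str) -> bytes:
    """Hypothetical helper: reject anything outside 7-bit ASCII instead of guessing an encoding."""
    try:
        return text.encode("ascii") + b"\x00"
    except UnicodeEncodeError as e:
        # e.start is the index of the first character that could not be encoded.
        raise ValueError(
            f"non-ASCII character {text[e.start]!r} in .asciz string"
        ) from e


print(encode_asciz_utf8("奩").hex(" "))  # e5 a5 a9 00
```

With the strict variant, `encode_asciz_strict("奩")` raises `ValueError` rather than silently choosing UTF-8, which is roughly what "rejecting" would mean here.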

Fuuzetsu, May 10 '23 14:05