Should non-ASCII contents in .asciz blocks be rejected?
I've opened an issue at https://github.com/Decompollaborate/spimdisasm/issues/121 showing that `.asciz` blocks are sometimes emitted with non-ASCII characters in them. m2c then happily takes them and outputs the bytes of their UTF-8 encoding (as far as I can tell). That is, given `.asciz "奩"`, it will spit out `e5 a5 a9 00`.

A brief grep brings me to `parse_ascii_directive`, which at the very top says something about being wrong w.r.t. encodings; I guess this is exactly the issue it's talking about? A few lines later I see a very explicit `c.encode("utf-8")`. Is this what MIPS assemblers would usually do, or would they just interpret the whole input as ASCII to start with? I don't know.
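To make the question concrete, here is a minimal standalone sketch of the two behaviours. It is not m2c's actual code: `encode_asciz_utf8` and `encode_asciz_strict` are hypothetical names, and the first only mimics what I believe the `c.encode("utf-8")` call amounts to for a NUL-terminated string.

```python
def encode_asciz_utf8(text: str) -> bytes:
    """Mimic the observed behaviour: UTF-8 bytes plus a trailing NUL."""
    return text.encode("utf-8") + b"\x00"


def encode_asciz_strict(text: str) -> bytes:
    """Hypothetical stricter variant: refuse anything outside 7-bit ASCII."""
    try:
        return text.encode("ascii") + b"\x00"
    except UnicodeEncodeError as err:
        raise ValueError(
            f"non-ASCII character in .asciz string: {text!r}"
        ) from err


print(encode_asciz_utf8("奩").hex(" "))  # e5 a5 a9 00
```

With the strict variant, `encode_asciz_strict("奩")` raises `ValueError` instead of silently producing UTF-8 bytes. Whether rejecting is the right call depends on what the original assembler did with such input, which is the open question here.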