Invalid character in name field
ttx crashes when importing an invalid character from XML.
One font has this
<namerecord nameID="5" platformID="3" platEncID="1" langID="0x409">
Converter: Windows Type 1 Installer V1.0d.Font: V1.0
</namerecord>
when passed through ttx. When this XML is fed back to ttx, it crashes with the Expat error "xml.parsers.expat.ExpatError: reference to invalid character number".
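For anyone trying to reproduce this outside of ttx, here is a minimal sketch using only the standard library's expat bindings. It assumes the offending name record contains a numeric character reference to U+FFFF (the sample above does not show it explicitly); `try_parse` and the two sample documents are made up for illustration and are not fontTools code.

```python
import xml.parsers.expat


def try_parse(document: bytes):
    """Return None if expat accepts the document, else the expat error message."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(document, True)
    except xml.parsers.expat.ExpatError as exc:
        # ErrorString maps the numeric error code to expat's message text.
        return xml.parsers.expat.ErrorString(exc.code)
    return None


# Hypothetical inputs: U+FFFF is outside the XML 1.0 "Char" range,
# while U+FFFD (the replacement character) is inside it.
bad = b"<namerecord>Font&#65535;</namerecord>"
ok = b"<namerecord>Font&#65533;</namerecord>"

print(try_parse(bad))
print(try_parse(ok))
```

The first call should report the same message as in the traceback above, and the second should parse cleanly.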
Ugh. Not sure there's much we can do here. Unfortunately some systems read some parts of the Unicode standard too strictly, i.e. in this case, disallowing U+FFFF although there's no practical reason to do that.
At least fontTools could catch the error instead of crashing... or change to another xml parsing lib ;-)
Perhaps we should catch this before writing the XML out, and write a binary encoded version. I have a patchset for that. Will dig it out.
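A rough sketch of what such a pre-write check could look like, not the actual patchset: test the string against the XML 1.0 "Char" production and fall back to a hex dump of the encoded bytes when it does not fit. The names `is_xml_safe` and `encode_name_for_xml` are hypothetical, and the UTF-16BE default is an assumption based on the Windows Unicode name record shown above.

```python
import re

# Anything outside the XML 1.0 "Char" production:
# #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_INVALID_XML_CHARS = re.compile(
    "[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def is_xml_safe(text: str) -> bool:
    """True if every character in text may appear in an XML 1.0 document."""
    return _INVALID_XML_CHARS.search(text) is None


def encode_name_for_xml(text: str, encoding: str = "utf_16_be"):
    """Hypothetical helper: return ("text", s) or ("hex", hexdump) for a name string."""
    if is_xml_safe(text):
        return ("text", text)
    # Fall back to a hex dump of the raw bytes so the value can roundtrip.
    return ("hex", text.encode(encoding).hex())


print(encode_name_for_xml("Converter"))
print(encode_name_for_xml("A\uffff"))
```

A reader on the import side would need to recognise the hex form and decode it back; that part is left out here.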
Ugh. Python's own Unicode routines don't have any problem with decoding to U+FFFF:
$ python3
Python 3.2.3 (default, Feb 27 2014, 21:31:18)
[GCC 4.6.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> b'\xef\xbf\xbf'.decode('utf-8')
'\uffff'
We really have to go out of our way to fix this. I'm going to add an option to dump name entries "raw", which should make them never fail roundtripping.
@behdad I'm currently experiencing the same problem. I can't seem to find the option to dump the name entries "raw", did this option ever make it?
It doesn't look like it unfortunately.
#3868 allows namerecords containing U+FFFF to pass, although it may not be considered a fix of this issue.