
Invalid character in name field

Open olivierberten opened this issue 11 years ago • 7 comments

ttx crashes on importing invalid character from xml.

One font has this

    <namerecord nameID="5" platformID="3" platEncID="1" langID="0x409">
      Converter: Windows Type 1 Installer V1.0d.&#65535;Font: V1.0
    </namerecord>

when passed through ttx. When this XML is fed back to ttx, it crashes with an Expat error: "xml.parsers.expat.ExpatError: reference to invalid character number"
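For reference, the failure can be reproduced without ttx or a font file. The sketch below feeds a shortened copy of the name record straight to Python's `xml.parsers.expat`; the helper name `parse_error` is made up for illustration and is not part of fontTools.

```python
import xml.parsers.expat

# Shortened copy of the name record above; 65535 == 0xFFFF.
snippet = b"<namerecord>Installer V1.0d.&#65535;Font: V1.0</namerecord>"


def parse_error(data):
    """Return the expat error message for `data`, or None if it parses.

    Hypothetical helper for this example only.
    """
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(data, True)  # True: this is the final chunk of input
    except xml.parsers.expat.ExpatError as exc:
        return str(exc)
    return None


print(parse_error(snippet))
```

The same helper returns `None` for `&#65533;` (U+FFFD), so the rejection is specific to the character number rather than to numeric references in general.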

olivierberten avatar Jun 12 '14 14:06 olivierberten

Ugh. Not sure there's much we can do here. Unfortunately, some systems read parts of the Unicode standard too strictly, i.e. in this case, disallowing U+FFFF although there's no practical reason to do that.

behdad avatar Jun 12 '14 22:06 behdad

At least fontTools could catch the error instead of crashing... or change to another xml parsing lib ;-)
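A minimal sketch of what "catching the error" could look like, assuming the XML is available as bytes. `check_ttx_xml` is a hypothetical name, not an existing fontTools function; it only uses the documented `lineno`, `offset`, and `code` attributes of `ExpatError`.

```python
import xml.parsers.expat


def check_ttx_xml(data):
    """Return a readable error string if `data` is not parseable, else None.

    Hypothetical pre-flight check; fontTools does not ship this function.
    """
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(data, True)
    except xml.parsers.expat.ExpatError as exc:
        # ExpatError carries the position and a numeric error code.
        reason = xml.parsers.expat.ErrorString(exc.code)
        return "invalid XML at line %d, column %d: %s" % (
            exc.lineno, exc.offset, reason)
    return None


# Usage: report the problem instead of letting the traceback escape.
problem = check_ttx_xml(b"<name>\n&#65535;</name>")
if problem is not None:
    print(problem)
```

This does not make the file importable; it only turns the crash into a message that points at the offending line.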

olivierberten avatar Jun 13 '14 07:06 olivierberten

Perhaps we should catch this before writing the XML out, and write a binary encoded version. I have a patchset for that. Will dig it out.

behdad avatar Jul 21 '14 17:07 behdad

Ugh. Python's own Unicode routines don't have any problem with decoding to U+FFFF:

    $ python3
    Python 3.2.3 (default, Feb 27 2014, 21:31:18)
    [GCC 4.6.3] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> b'\xef\xbf\xbf'.decode('utf-8')
    '\uffff'

We really have to go out of our way to fix this. I'm going to add an option to dump name entries "raw", which should make them never fail roundtripping.
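One way the "raw" idea could work, sketched under assumptions: check each string against the XML 1.0 `Char` production before writing, and fall back to a hex dump of the UTF-16BE bytes when it contains anything outside that set. The function names and the `("text" | "raw", payload)` tagging are invented for this example and are not a fontTools API or the patchset mentioned above.

```python
def is_xml_safe(text):
    """True if every character is allowed by the XML 1.0 Char production."""
    for ch in text:
        cp = ord(ch)
        if not (cp in (0x9, 0xA, 0xD)
                or 0x20 <= cp <= 0xD7FF
                or 0xE000 <= cp <= 0xFFFD
                or 0x10000 <= cp <= 0x10FFFF):
            return False
    return True


def dump_name(text):
    """Return ('text', text) when safe, else ('raw', hex of UTF-16BE bytes)."""
    if is_xml_safe(text):
        return ("text", text)
    # surrogatepass keeps lone surrogates, which are also invalid in XML.
    return ("raw", text.encode("utf_16_be", "surrogatepass").hex())


def load_name(kind, payload):
    """Inverse of dump_name."""
    if kind == "raw":
        return bytes.fromhex(payload).decode("utf_16_be", "surrogatepass")
    return payload


original = "Converter: Windows Type 1 Installer V1.0d.\uffffFont: V1.0"
kind, payload = dump_name(original)
assert load_name(kind, payload) == original  # survives the round trip
```

The hex payload contains only ASCII digits and letters, so it can be written into XML without any character references.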

behdad avatar Jul 21 '14 17:07 behdad

> We really have to go out of our way to fix this. I'm going to add an option to dump name entries "raw", which should make them never fail roundtripping.

@behdad I'm currently experiencing the same problem. I can't seem to find the option to dump the name entries "raw", did this option ever make it?

junjie avatar Feb 22 '23 03:02 junjie

> did this option ever make it?

It doesn't look like it unfortunately.

behdad avatar Feb 22 '23 16:02 behdad

#3868 allows namerecords containing U+FFFF to pass, although it may not be considered a fix of this issue.

knutnergaard avatar Jul 16 '25 08:07 knutnergaard