
Invalid UTF-8 in tile data causes crash

Open TimSylvester opened this issue 4 years ago • 2 comments

Upon rendering "Mount Nebo" in the neighborhood of St. Louis on the MapTiler topo map:

JNI DETECTED ERROR IN APPLICATION: input is not valid Modified UTF-8: illegal start byte 0xb2
    java_vm_ext.cc:578]     string: 'Mount Nebo 
    java_vm_ext.cc:578] 250 m
    java_vm_ext.cc:578] �'
    java_vm_ext.cc:578]     input: '0x4d 0x6f 0x75 0x6e 0x74 0x20 0x4e 0x65 0x62 0x6f 0x20 0x0a 0x32 0x35 0x30 0x20 0x6d 0x0a <0xb2>'
    java_vm_ext.cc:578]     in call to NewStringUTF
    java_vm_ext.cc:578]     from boolean com.mousebird.maply.MapboxVectorTileParser.parseData(byte[], com.mousebird.maply.VectorTileData, com.mousebird.maply.LoaderReturn)

In UTF-8, every byte above 0x7f is part of a multi-byte sequence, and 0xb2 (binary 10110010) is a continuation byte that can only follow a lead byte, so a string ending in a lone 0xb2 is invalid.
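As a rough illustration of why the JNI check rejects 0xb2 as a start byte, here is a minimal sketch that classifies a byte by the sequence length it would introduce in standard UTF-8. The function name `utf8SequenceLength` is hypothetical and is not part of WhirlyGlobe or the JNI API.

```cpp
#include <cstddef>

// Hypothetical helper, for illustration only.
// Returns the expected length of a standard UTF-8 sequence given its
// first byte, or 0 if the byte cannot start a sequence.
inline std::size_t utf8SequenceLength(unsigned char lead)
{
    if (lead < 0x80) return 1;                   // 0xxxxxxx: ASCII
    if (lead >= 0xC2 && lead <= 0xDF) return 2;  // 110xxxxx (0xC0/0xC1 would be overlong)
    if (lead >= 0xE0 && lead <= 0xEF) return 3;  // 1110xxxx
    if (lead >= 0xF0 && lead <= 0xF4) return 4;  // 11110xxx, up to U+10FFFF
    return 0;  // 0x80..0xBF are continuation bytes; 0xC0, 0xC1, 0xF5..0xFF are never valid
}
```

With this classification, every byte of "Mount Nebo \n250 m\n" maps to 1, while the trailing 0xb2 maps to 0, which matches the "illegal start byte" message in the log.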

I found a reference to a known issue relating to 4-byte codepoints, but that was fixed in API 23; this crash occurred on API 30, targeting 28.

We should be validating UTF-8 on input. Until then, it would probably be enough to pad the memory allocations for strings with a few extra zero bytes, so that if the JVM's UTF-8 processor walks off the end of a string, it is guaranteed to find a terminating zero before reaching invalid memory.
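One possible shape for the input validation mentioned above is sketched here: a hypothetical helper, `sanitizeForJni`, that replaces any byte not belonging to a well-formed 1 to 3 byte sequence with '?' before the string would be handed to `NewStringUTF`. This is an assumption about how a fix could look, not the change actually made in the library. It rejects 4-byte sequences on the assumption that Modified UTF-8 represents supplementary characters as surrogate pairs, and it does not check for overlong 3-byte forms.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, for illustration only.
// Returns a copy of `in` where each byte that is not part of a
// well-formed 1-3 byte sequence is replaced by '?'.
inline std::string sanitizeForJni(const std::string &in)
{
    std::string out;
    out.reserve(in.size());
    std::size_t i = 0;
    while (i < in.size())
    {
        const unsigned char c = static_cast<unsigned char>(in[i]);
        std::size_t len = 0;
        if (c != 0 && c < 0x80) len = 1;              // ASCII; an embedded NUL would truncate the C string
        else if (c >= 0xC2 && c <= 0xDF) len = 2;     // two-byte lead
        else if (c >= 0xE0 && c <= 0xEF) len = 3;     // three-byte lead
        // Anything else (continuation bytes, 4-byte leads, 0xC0/0xC1) leaves len == 0.

        // The whole sequence must fit in the buffer and be followed by 10xxxxxx bytes.
        bool ok = len > 0 && i + len <= in.size();
        for (std::size_t k = 1; ok && k < len; ++k)
        {
            const unsigned char cc = static_cast<unsigned char>(in[i + k]);
            ok = (cc & 0xC0) == 0x80;
        }

        if (ok) { out.append(in, i, len); i += len; }
        else    { out.push_back('?'); ++i; }
    }
    return out;
}
```

Applied to the bytes from the log, the label text is preserved and only the stray trailing 0xb2 is replaced. Unlike zero-padding the allocation, this approach changes the string content, so whether '?' or dropping the byte is preferable would depend on how the label is displayed.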

[Image: Screen Shot 2021-04-16 at 9 35 44 AM]

Also observed near Kansas City:

    java_vm_ext.cc:578]     string: 'Skunk Hill 
    java_vm_ext.cc:578] 412 m
    java_vm_ext.cc:578] �'
    java_vm_ext.cc:578]     input: '0x53 0x6b 0x75 0x6e 0x6b 0x20 0x48 0x69 0x6c 0x6c 0x20 0x0a 0x34 0x31 0x32 0x20 0x6d 0x0a <0xb2>'

[Image: Screenshot_20210416-103114_AutoTester]

TimSylvester, Apr 16 '21 16:04

That's a wild one. Good catch.

mousebird, Apr 16 '21 17:04

See also #1262

TimSylvester, Jun 23 '21 16:06