OpenImageIO
[BUG] python3.10 spec.get_string_attribute() throws a runtime exception with non-utf-8 string headers
Describe the bug
When looping over all key/value pairs in an EXR header using the Python bindings to OpenImageIO, a field containing 0xFF bytes raises an unhelpful "RuntimeError: Could not allocate string object!". This is very commonly seen in DI when the upstream writing tool doesn't want to set a value and instead fills the field with a default of 0xFF.
To Reproduce
With the following header (as shown with iinfo -a -v):
AudioFramerate: 24
AudioInfo: "������������������������������������������������������������������������@"
AudioRunningLTC: "MOS"
Where the � character is 0xFF.
Attempting to iterate over headers like so:
>>> import OpenImageIO as oiio
>>> src = oiio.ImageBuf('./my_file.1000.exr')
>>> spec = src.spec()
>>> for a in spec.extra_attribs:
...     print(f'{a.name} ({a.type}): {a.value} ({type(a.value)})')
Causes the following exception:
AudioFileName (string): (<class 'str'>)
AudioFramerate (float): 24.0 (<class 'float'>)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "<stdin>", line 2, in <module>
RuntimeError: Could not allocate string object!
Expected behavior
It appears that strings are assumed to be UTF-8, which causes an error when non-UTF-8 data is decoded into a Python string.
The workaround is to not access a.value and instead ask for the attribute directly like so:
>>> try:
...     x = spec.get_string_attribute('AudioInfo')
... except RuntimeError:
...     pass
However, I was hoping for more graceful handling of this error. For example, I cannot access the raw bytes of this value to determine whether the offending field is actually usable. Perhaps adding an a.raw_value could return the field's contents as a bytestring, so the try/except above has an opportunity to do something special with the value. Since accessing the attribute forces the coercion to unicode, I cannot get to the original data from Python in any way.
Platform information:
- OIIO branch/version: openimageio-2.4.5.0-r0
- OS: Alpine Linux 3.17 under Docker
- C++ compiler:
- Any non-default build flags when you build OIIO:
To work around this, I switched to using spec.get_bytes_attribute() and then attempting my own string conversions.
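For anyone landing here, a minimal sketch of what such a conversion might look like. The helper name decode_header_bytes and the filler parameter are hypothetical (not part of OIIO), and the code assumes the raw value arrives as a Python bytes object. It treats a value made entirely of filler bytes as "unset" and otherwise falls back to escaping undecodable bytes rather than raising.

```python
from typing import Optional


def decode_header_bytes(raw: bytes, filler: int = 0xFF) -> Optional[str]:
    """Best-effort conversion of a raw header value to str (hypothetical helper).

    Returns None when the value is empty or consists only of filler bytes.
    """
    # An empty value, or one made entirely of filler bytes, is treated as unset.
    if not raw or all(b == filler for b in raw):
        return None
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Keep the offending bytes visible as \xNN escapes instead of failing.
        return raw.decode("utf-8", errors="backslashreplace")


print(decode_header_bytes(b"MOS"))           # plain ASCII decodes as-is
print(decode_header_bytes(b"\xff\xff\xff"))  # only filler, treated as unset
print(decode_header_bytes(b"\xff\xff@"))     # mixed content, escaped
```

In the loop from the report, the input would presumably be the result of spec.get_bytes_attribute() for each attribute name; that wiring is left out here so the sketch runs without OIIO installed.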
However, the Python get_string_attribute() should still not fail so badly when the underlying value cannot be parsed as a string. Perhaps raising the UnicodeDecodeError directly would allow a try/except to catch that specific error rather than the much more generic RuntimeError.
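In the meantime, the traceback above says the UnicodeDecodeError was the "direct cause" of the RuntimeError, which suggests it is reachable through the standard __cause__ attribute. A rough sketch of telling the two cases apart follows. read_value and fake_getter are hypothetical names, and fake_getter only simulates the chained exception in pure Python; I have not verified what the real binding attaches to the RuntimeError.

```python
def read_value(getter):
    """Call getter(); return (value, None) or (None, the UnicodeDecodeError)."""
    try:
        return getter(), None
    except RuntimeError as exc:
        cause = exc.__cause__
        if isinstance(cause, UnicodeDecodeError):
            # The generic RuntimeError was really a decoding failure.
            return None, cause
        # Any other RuntimeError is not ours to swallow.
        raise


def fake_getter():
    # Stand-in for the failing attribute access: mimics the chained exception.
    try:
        return b"\xff@".decode("utf-8")
    except UnicodeDecodeError as err:
        raise RuntimeError("Could not allocate string object!") from err


value, error = read_value(fake_getter)
print(value, type(error).__name__)
```

With the real bindings, the getter would be something like a lambda wrapping the attribute access from the report; other RuntimeErrors still propagate unchanged.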