[Bug]: Mixing Latin1 and UTF8 Encoding in Image Attribute Parameters
Is there an existing issue for this?
- [X] I have searched the existing issues and checked the recent builds/commits
What happened?
I'm trying to use OIIO (OpenImageIO) to read the attribute parameters of images that the library generates. The encoding of these parameters seems to be a mix of Latin1 and UTF8.
Steps to reproduce the problem
OIIO code: `ImageBuf inp; ... const ImageSpec& spec = inp.spec();
OIIO::ustring parameters; spec.getattribute("Parameters", TypeString, &parameters);`
For example, I have noticed that the non-breaking space (U+00A0), which is %C2%A0 in UTF8, is stored as the single byte %A0, as in Latin1. When I pass a Latin1 encoded string containing this character to the JSON module for processing, it crashes.
I am relatively new to Python and appreciate any guidance on this matter. Thank you.
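As a stopgap on the reading side, here is a minimal sketch, assuming the raw bytes of the attribute are available: try strict UTF8 first and fall back to Latin1 only when that fails. `decode_text_bytes` and the sample payloads are hypothetical, made up for illustration. The heuristic can guess wrong for Latin1 text whose bytes happen to form valid UTF8.

```python
import json


def decode_text_bytes(raw: bytes) -> str:
    """Decode bytes that may be UTF-8 or Latin-1 (hypothetical helper)."""
    try:
        # Strict UTF-8 rejects a lone 0xA0 byte, so Latin-1 data fails here.
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Latin-1 maps every byte to a code point, so this never raises.
        return raw.decode("latin-1")


# The same JSON text with a non-breaking space (U+00A0), in both encodings.
latin1_payload = b'{"prompt": "a\xa0b"}'
utf8_payload = b'{"prompt": "a\xc2\xa0b"}'

from_latin1 = json.loads(decode_text_bytes(latin1_payload))
from_utf8 = json.loads(decode_text_bytes(utf8_payload))
```

Both payloads decode to the same Python string, so the JSON step no longer depends on which encoding the file used.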
What should have happened?
Perhaps the encoding should be standardised to UTF8 for other developers.
Commit where the problem happens
Nothing
What platforms do you use to access the UI ?
No response
What browsers do you use to access the UI ?
No response
Command Line Arguments
Nothing
List of extensions
No
Console logs
Nothing
Additional information
No response
That's PIL's doing.
https://github.com/python-pillow/Pillow/blob/eccc1e9afb4b2b012772abdad026e1efa49b0e9f/src/PIL/PngImagePlugin.py#L337-L342
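For context, a rough sketch of what I understand that code to do: Pillow tries to encode each text value as Latin1 for a tEXt chunk, and only falls back to a UTF8 iTXt chunk when the value cannot be represented in Latin1. That would explain the mix, since a non-breaking space fits in Latin1 and so is written as a single %A0 byte. `choose_chunk` below is a hypothetical helper written for illustration, not Pillow's API.

```python
def choose_chunk(value: str) -> tuple[str, bytes]:
    """Mimic a Latin-1-first, UTF-8-fallback policy (illustration only)."""
    try:
        # tEXt chunks hold Latin-1 text, so this is tried first.
        return "tEXt", value.encode("latin-1", "strict")
    except UnicodeError:
        # Anything outside Latin-1 goes into an iTXt chunk as UTF-8.
        return "iTXt", value.encode("utf-8")


nbsp_result = choose_chunk("a\u00a0b")    # non-breaking space fits in Latin-1
cjk_result = choose_chunk("\u65e5\u672c")  # CJK text does not
```

A reader that treats every text chunk as UTF8 will therefore trip over the tEXt case.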
Here's how to read it with PIL.
from PIL import Image
image = Image.open("image.png")
parameters = image.info["parameters"]
In somewhat related news, if you want to use magick's identify to read UTF8 image parameters, it can (now) as it was patched in a nightly build. Not sure if it's hit a release version yet or not. exiftool can as well.