Vulnerability in API
piexif.load accepts both filenames and raw data.
elif maybe_image and data[0:2] in (b"\x49\x49", b"\x4d\x4d"):  # TIFF
    self.tiftag = data
elif maybe_image and data[0:4] == b"RIFF" and data[8:12] == b"WEBP":
    self.tiftag = _webp.get_exif(data)
elif maybe_image and data[0:4] == b"Exif":  # Exif
    self.tiftag = data[6:]
else:
    with open(data, 'rb') as f:
An attacker can craft an image whose Exif chunk contains arbitrary data.
from PIL import Image
import piexif
im = Image.open("Image.jpg")
exif = piexif.load(im.info["exif"])
As a result, when the bytes in im.info["exif"] match none of the magic numbers checked above, piexif falls through to the else branch, treats those bytes as a file path, opens that file, and tries to read it.
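The fall-through can be illustrated with a simplified stand-in for the dispatch quoted above. The name `classify_input` is hypothetical and not part of piexif. It mirrors only the branches shown in the fragment and reports which one would be taken, without opening anything.

```python
def classify_input(data: bytes) -> str:
    """Report which branch of the quoted dispatch a byte string would reach.

    Hypothetical helper for illustration only; it is not part of piexif.
    """
    if data[0:2] in (b"\x49\x49", b"\x4d\x4d"):
        return "tiff"
    if data[0:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "webp"
    if data[0:4] == b"Exif":
        return "exif"
    # No magic number matched: the quoted code would call open(data, 'rb') here.
    return "path"


benign = b"Exif\x00\x00MM\x00\x2a"  # typical start of an Exif payload
crafted = b"/etc/passwd"            # attacker-chosen bytes with no known magic

print(classify_input(benign))   # exif
print(classify_input(crafted))  # path
```

Any payload that avoids the recognized prefixes ends up in the last case, so the attacker controls the path that gets opened.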
Solution
piexif should never try to open a file when the argument is bytes rather than str.
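One way to express that rule is a type check in front of any `open` call. The sketch below is an assumption about how such a guard could look, not piexif's actual fix, and `read_source` is a hypothetical name.

```python
import os
from typing import Union


def read_source(source: Union[str, os.PathLike, bytes]) -> bytes:
    """Return raw image/Exif bytes without ever treating bytes as a path.

    Hypothetical helper for illustration only; it is not part of piexif.
    """
    if isinstance(source, (bytes, bytearray)):
        # Bytes are always data, even when no magic number is recognized.
        return bytes(source)
    if isinstance(source, (str, os.PathLike)):
        # Only str and path objects are allowed to reach the filesystem.
        with open(source, "rb") as f:
            return f.read()
    raise TypeError(
        f"expected str, os.PathLike or bytes, got {type(source).__name__}"
    )
```

With this guard, `read_source(b"/etc/passwd")` returns the eleven bytes unchanged instead of opening the file of that name.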
That's a great observation. This is probably part of the reason most Python I/O libraries have separate load(file: IO) and loads(data: bytes) functions, and likewise dump and dumps. The other reason is avoiding path/data ambiguity, which can have some really nasty corner cases (a POSIX path can contain essentially any byte, save for 0x00 IIRC). Basically, there is no reliable, unambiguous way to determine whether a collection of bytes is a path or just data.
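A minimal sketch of that load/loads split, in the style of the standard library's `json` and `pickle` modules. `parse_exif` is a hypothetical placeholder for the real parser, and none of these functions reflect piexif's current API.

```python
import io
from typing import BinaryIO


def parse_exif(data: bytes) -> dict:
    """Hypothetical placeholder for the real parser; it only records the size."""
    return {"size": len(data)}


def loads(data: bytes) -> dict:
    """Parse Exif from in-memory bytes. Never touches the filesystem."""
    if not isinstance(data, (bytes, bytearray)):
        raise TypeError("loads() expects bytes")
    return parse_exif(bytes(data))


def load(fp: BinaryIO) -> dict:
    """Parse Exif from an already opened binary file object."""
    return loads(fp.read())


print(loads(b"/etc/passwd"))              # bytes are parsed as data, not opened
print(load(io.BytesIO(b"Exif\x00\x00")))  # the caller decides what gets opened
```

Because the caller has to open the file explicitly, the library never needs to guess whether its argument is a path.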
FYI: there is now a vulnerability entry for this, and threat scanners are picking it up.
https://security.snyk.io/vuln/SNYK-PYTHON-PIEXIF-2312874