streaming
Passing jpegs constructed from byte streams crashes with `FileNotFoundError: [Errno 2] No such file or directory: ''`
Environment
I don't believe this bug is environment-dependent.
To reproduce
This line of code: https://github.com/mosaicml/streaming/blob/v0.5.2/streaming/base/format/mds/encodings.py#L417 assumes that `Image`s created from byte streams will have `hasattr(obj, 'filename') == False`. However, I believe they will actually have `obj.filename == ''`. This then throws an exception on `with open(obj.filename, 'rb') as f`, since the `filename` is an empty string.
Here's a minimal script you can use to prove this:
(pillowenv) devbox-david-84f94b7bb8-4hrlq% cat image_foo.py
from PIL import Image
from io import BytesIO
# Open the JPEG image using Pillow
with Image.open('/some/path.jpg') as img:
    img_bytes_io = BytesIO()
    # Save the image to the BytesIO object in JPEG format
    img.save(img_bytes_io, format='JPEG')
    image_data = img_bytes_io.getvalue()
image_bytes_io = BytesIO(image_data)
image_opened = Image.open(image_bytes_io)
print(image_bytes_io)
print(image_opened)
print(hasattr(image_opened, 'filename'))
print(image_opened.filename)
(pillowenv) devbox-david-84f94b7bb8-4hrlq% python image_foo.py
<_io.BytesIO object at 0x7fbe49122450>
<PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=640x480 at 0x7FBE49115850>
True
You can see that it has the `filename` attribute, and printing it gives an empty string. I imagine the fix is just to check whether the filename is an empty string?
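The suggested check could look roughly like the sketch below. It is not the library's actual code: `encode_jpeg` is a hypothetical helper name, and the extra `format == 'JPEG'` guard is an assumption added here so that only files already stored as JPEG are read back from disk. The helper relies only on the `filename`, `format`, and `save` attributes that Pillow images expose.

```python
from io import BytesIO


def encode_jpeg(obj) -> bytes:
    """Hypothetical helper: return JPEG bytes for a Pillow-style image."""
    # Images opened from a byte stream have filename == '', so test the
    # value's truthiness instead of using hasattr().
    filename = getattr(obj, 'filename', '')
    # Assumption: only reuse the file on disk when it is already a JPEG.
    if filename and getattr(obj, 'format', None) == 'JPEG':
        with open(filename, 'rb') as f:
            return f.read()
    # No usable path: re-encode the image in memory.
    out = BytesIO()
    obj.save(out, format='JPEG')
    return out.getvalue()
```

With this shape, an image opened from a `BytesIO` falls through to the in-memory branch instead of calling `open('')`.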
Expected behavior
Additional context
This was my temporary workaround, using monkey-patching, until the bug is fixed:
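from io import BytesIO

from PIL import Image
from streaming.base.format.mds.encodings import JPEG


def jpeg_encode_without_filename(self: JPEG, obj: Image.Image) -> bytes:
    self._validate(obj, Image.Image)  # pylint: disable=protected-access
    # Always re-encode in memory instead of reading obj.filename from disk
    out = BytesIO()
    obj.save(out, format="JPEG")
    return out.getvalue()


JPEG.encode = jpeg_encode_without_filename.__get__(JPEG(), JPEG)  # pylint: disable=no-value-for-parameter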
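A note on the patching style: `__get__(JPEG(), JPEG)` binds the replacement to one throwaway instance, so every call receives that instance as `self`. Assigning the plain function to the class is a simpler alternative that lets Python bind `self` per call. The sketch below shows this with `Codec`, a hypothetical stand-in class rather than the real `JPEG` encoder.

```python
class Codec:
    """Hypothetical stand-in for the encoder class being patched."""

    def __init__(self, name: str) -> None:
        self.name = name

    def encode(self, obj: str) -> bytes:
        # Mimics the failing original method.
        raise FileNotFoundError(obj)


def patched_encode(self: Codec, obj: str) -> bytes:
    # self is whichever instance the method is called on.
    return f'{self.name}:{obj}'.encode()


# Assign the plain function; normal method binding does the rest.
Codec.encode = patched_encode

a, b = Codec('a'), Codec('b')
```

Here `a.encode('x')` and `b.encode('x')` each see their own instance as `self`.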
Hey @davidabrahams1 and @sagnak, is this still an issue you're seeing?