Passing jpegs constructed from byte streams crashes with `FileNotFoundError: [Errno 2] No such file or directory: ''`

Open davidabrahams1 opened this issue 1 year ago • 2 comments

Environment

I don't believe this bug is environment-dependent.

To reproduce

This line of code: https://github.com/mosaicml/streaming/blob/v0.5.2/streaming/base/format/mds/encodings.py#L417

assumes that images created from byte streams will have `hasattr(obj, 'filename') == False`. However, I believe they actually have `obj.filename == ''`. The call `with open(obj.filename, 'rb') as f` then raises an exception, because the filename is an empty string.

Here's a minimal script you can use to prove this:

(pillowenv) devbox-david-84f94b7bb8-4hrlq% cat image_foo.py 
from PIL import Image
from io import BytesIO

# Open the JPEG image using Pillow
with Image.open('/some/path.jpg') as img:
    img_bytes_io = BytesIO()
    
    # Save the image to the BytesIO object in JPEG format
    img.save(img_bytes_io, format='JPEG')

image_data = img_bytes_io.getvalue()
image_bytes_io = BytesIO(image_data)
image_opened = Image.open(image_bytes_io)
print(image_bytes_io)
print(image_opened)
print(hasattr(image_opened, 'filename'))
print(image_opened.filename)
(pillowenv) devbox-david-84f94b7bb8-4hrlq% python image_foo.py 
<_io.BytesIO object at 0x7fbe49122450>
<PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=640x480 at 0x7FBE49115850>
True

You can see that the image has the `filename` attribute, and that printing it gives an empty string. I imagine the fix is just to check whether the filename is an empty string?
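A minimal sketch of that check, for illustration only. `encode_jpeg` is a hypothetical helper, not the library's actual code. It assumes only that the object exposes a `filename` attribute and a `save(fp, format=...)` method, as a Pillow image does:

```python
from io import BytesIO
from typing import Any


def encode_jpeg(obj: Any) -> bytes:
    """Hypothetical helper: return JPEG bytes for an image-like object."""
    # An image opened from a byte stream still has a `filename` attribute,
    # but its value is '', so test the value rather than its presence.
    filename = getattr(obj, 'filename', '')
    if filename:
        # The image came from a file on disk: reuse the original bytes.
        with open(filename, 'rb') as f:
            return f.read()
    # No usable path: encode the in-memory image as JPEG instead.
    out = BytesIO()
    obj.save(out, format='JPEG')
    return out.getvalue()
```

With this check, an image opened from a `BytesIO` takes the in-memory branch instead of reaching `open('')`.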

Expected behavior

Additional context

davidabrahams1 • Sep 05 '23 22:09

This was my temporary workaround, using monkey-patching, until the bug is fixed:

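from io import BytesIO

from PIL import Image
from streaming.base.format.mds.encodings import JPEG  # module linked above (v0.5.2)


def jpeg_encode_without_filename(self: JPEG, obj: Image.Image) -> bytes:
    # Always encode in memory; never try to reopen obj.filename from disk.
    self._validate(obj, Image.Image)  # pylint: disable=protected-access
    out = BytesIO()
    obj.save(out, format="JPEG")
    return out.getvalue()


# Replace the encoder on the class so every JPEG instance uses it.
JPEG.encode = jpeg_encode_without_filename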

sagnak • Sep 05 '23 22:09

Hey @davidabrahams1 and @sagnak, is this still an issue you're seeing?

snarayan21 • May 29 '24 19:05