django-storages icon indicating copy to clipboard operation
django-storages copied to clipboard

File was not opened in read | write mode with the S3Storage backend

Open atleta opened this issue 4 years ago • 4 comments

I saw that this issue was reported at least twice in the past few years, but it probably was never properly described and thus got closed. It seems that once the file is open in either mode (read or write) you can't switch to the other.

First of all, it's not immediately obvious that you shouldn't be able to open it multiple times. E.g. given

class MyModel(models.Model):
    some_field = ImageField()

and m = MyModel,objects.first() if you use m.some_field for a read operation (e.g. to process the image) then it gets opened in read mode by default. But then there is no way to write back to it. I've tried it the following ways:

f=m.some_field.open('w')
f.write('some content')

This could work, though one has to be aware of the magic Django does with FileFields (without that you'd expect this to return a new file-like object, whereas open really influences the field, which kind of looks like a file). But it could work, it would just have to change mode (basically probably close the file, and reopen it, causing it to flush if the transition is w -> r). Django doesn't do it either, though. (It seeks to 0.)

However, closing the file manually doesn't work either:

m.some_field.close()
f=m.some_field.open('w')
f.write('some content')

This causes the same exception. Indeed, the field is left in a bogus, state, it's probably not aware that it was closed (m.some_field.closed returns False even after doing a m.some_field.close()). read still works, though it seems to reopen the file (at least probably reaches out to S3 as the call takes some time to return). This happens with the open('w') call (after closing the file) as well but the file is still stuck in read mode and doesn't allow writing.

(I'm using an ImageField in the example because that's what I'm getting the error with, but I'm 99% sure it's the same with a plain FileField).

The only solution, for now, seems to be refreshing the model instance from the db, but that's slow and inconvenient. (Now I see that django actually doesn't allow re-opening a closed file, which is sad, but even then, closing the file should close the file and also conflicting file mode opens should be rejected. But that's likely a django bug.)

atleta avatar Jan 18 '21 05:01 atleta

Hey buddy, I kinda figured out what's happening.

The doc does not say s3file.open is supported at all by this library.

  • When a file is opened in "w"/"r" mode, the "mode" is stored in self._mode,
  • When you call s3file.open, because the S3Boto3StorageFile does not implement an open method, it calls django.core.files.File.open, which does not change self._mode, see https://docs.djangoproject.com/en/4.0/_modules/django/core/files/base/

I attempted this (not thoroughly tested), it seems to allow me to open file with read mode then write mode.

You can extend S3Boto3StorageFile then implement the open method like so:

    def open(self, mode='rb'):
        if mode != self._mode:
            self.close()
            self.__init__(self.name, mode, self._storage, self.buffer_size)

Then create a storage class that uses this file class. I hope this issue eventually gets resolved, but looking at the date of your issue creation I'm not very optimistic :(

rabbit-aaron avatar Mar 11 '22 14:03 rabbit-aaron

Minimal steps to reproduce this issue:


storage = S3Boto3Storage(...)
my_file = storage.open("test.txt", "w")
my_file.write("hi")
my_file.close()
my_file.open("r")
print(my_file.read())

exception:

Traceback (most recent call last):
  File "/.../main.py", line 0, in <module>
    main()
  File "/.../main.py", line 12, in main
    print(my_file.read())
  File "/.../lib/python3.8/site-packages/storages/backends/s3boto3.py", line 153, in read
    raise AttributeError("File was not opened in read mode.")
AttributeError: File was not opened in read mode.
Django 3.1.14
django-storages 1.12.3

rabbit-aaron avatar Mar 11 '22 14:03 rabbit-aaron

Hey @jschneier, love this lib and I've been using it since 2018! Is there any plan to support s3file.open? if so I could attempt to implement and submit a pull request.

rabbit-aaron avatar Mar 12 '22 01:03 rabbit-aaron

https://github.com/jschneier/django-storages/pull/1145

yywing avatar Jun 22 '22 11:06 yywing

I'm fighting the django file system all the time, it's insane, I feel like I'm walking in circles. They have problem with this use-case even when using default FileStorage. Unfortunately here the issue persists, and @rabbit-aaron workaround doesn't seem to be working.

inst.data.file.file._file # smells a bit?

Looks weird doesn't it? I don't think I'm wrong when I say that most user code expects the file/storage system to be stateless, meaning it's not caching any state (mode, buffer). So when I open the file for reading, then close it, I naturally expect that all state is lost. So next time I open it, I get a fresh file, be it for reading or writing. Django doesn't do that and it's buggy. I think django-storages can do better.

For any sane person, always use it like this:

with inst.data.storage.open(inst.data.name, 'rb') as f:
    ... # do read stuff
with inst.data.storage.open(inst.data.name, 'wb') as f:
    ... # do write stuff

skrat avatar Dec 06 '22 18:12 skrat

Was bitten by this issue today. I fixed it by just deleting the .file after exiting the with block. But ideally we'd not need to do this. I spent a little time trying to understand what's going on.

I'm not sure if it's relevant, but in S3Boto3StorageFile we have:

def close(self):
    # Partly omitted..
    if self._file is not None:
        self._file.close()
        self._file = None

When _file is closed, self.closed property is True. Once it's set to none, self.closed is False. When the Django core File is then open()ed again, we end up doing seek(0) instead of resetting the file. Can anything be done at the django-storages layer to update the self._mode on open?

EDIT: I'll add that Django's docs explicitly state support for reopening with a different mode. To make the django-storages abstraction be less leaky, I'd be in favor of keeping those semantics.

ref: https://docs.djangoproject.com/en/4.1/ref/files/file/#django.core.files.File.open

jaycle avatar Mar 24 '23 18:03 jaycle

I think the correct fix here would be to override S3Boto3StorageFile.open, right? The default with the django.core.files.base.File that hits the local filesystem doesn't make any sense in a cloud context. But calling self._storage.open does. I'll accept a PR with this fix.

Not sure if we need the re-seeking? If not, we'd want to close the file then I suppose. So there is some nuance.

I think failing that, @skrat has the correct approach.

jschneier avatar Mar 26 '23 02:03 jschneier

My attempt: https://github.com/jschneier/django-storages/pull/1227

rabbit-aaron avatar Mar 28 '23 12:03 rabbit-aaron