django-storages
django-storages copied to clipboard
File was not opened in read | write mode with the S3Storage backend
I saw that this issue was reported at least twice in the past few years, but it probably was never properly described and thus got closed. It seems that once the file is open in either mode (read or write) you can't switch to the other.
First of all, it's not immediately obvious that you shouldn't be able to open it multiple times. E.g. given
class MyModel(models.Model):
some_field = ImageField()
and m = MyModel,objects.first()
if you use m.some_field
for a read operation (e.g. to process the image) then it gets opened in read mode by default. But then there is no way to write back to it. I've tried it the following ways:
f=m.some_field.open('w')
f.write('some content')
This could work, though one has to be aware of the magic Django does with FileFields (without that you'd expect this to return a new file-like object, whereas open really influences the field, which kind of looks like a file). But it could work, it would just have to change mode (basically probably close the file, and reopen it, causing it to flush if the transition is w -> r
). Django doesn't do it either, though. (It seeks to 0.)
However, closing the file manually doesn't work either:
m.some_field.close()
f=m.some_field.open('w')
f.write('some content')
This causes the same exception. Indeed, the field is left in a bogus, state, it's probably not aware that it was closed (m.some_field.closed
returns False
even after doing a m.some_field.close()
). read
still works, though it seems to reopen the file (at least probably reaches out to S3 as the call takes some time to return). This happens with the open('w')
call (after closing the file) as well but the file is still stuck in read mode and doesn't allow writing.
(I'm using an ImageField in the example because that's what I'm getting the error with, but I'm 99% sure it's the same with a plain FileField).
The only solution, for now, seems to be refreshing the model instance from the db, but that's slow and inconvenient. (Now I see that django actually doesn't allow re-opening a closed file, which is sad, but even then, closing the file should close the file and also conflicting file mode opens should be rejected. But that's likely a django bug.)
Hey buddy, I kinda figured out what's happening.
The doc does not say s3file.open
is supported at all by this library.
- When a file is opened in "w"/"r" mode, the "mode" is stored in
self._mode
, - When you call
s3file.open
, because theS3Boto3StorageFile
does not implement anopen
method, it callsdjango.core.files.File.open
, which does not change self._mode, see https://docs.djangoproject.com/en/4.0/_modules/django/core/files/base/
I attempted this (not thoroughly tested), it seems to allow me to open file with read mode then write mode.
You can extend S3Boto3StorageFile
then implement the open method like so:
def open(self, mode='rb'):
if mode != self._mode:
self.close()
self.__init__(self.name, mode, self._storage, self.buffer_size)
Then create a storage class that uses this file class. I hope this issue eventually gets resolved, but looking at the date of your issue creation I'm not very optimistic :(
Minimal steps to reproduce this issue:
storage = S3Boto3Storage(...)
my_file = storage.open("test.txt", "w")
my_file.write("hi")
my_file.close()
my_file.open("r")
print(my_file.read())
exception:
Traceback (most recent call last):
File "/.../main.py", line 0, in <module>
main()
File "/.../main.py", line 12, in main
print(my_file.read())
File "/.../lib/python3.8/site-packages/storages/backends/s3boto3.py", line 153, in read
raise AttributeError("File was not opened in read mode.")
AttributeError: File was not opened in read mode.
Django 3.1.14
django-storages 1.12.3
Hey @jschneier, love this lib and I've been using it since 2018!
Is there any plan to support s3file.open
? if so I could attempt to implement and submit a pull request.
https://github.com/jschneier/django-storages/pull/1145
I'm fighting the django file system all the time, it's insane, I feel like I'm walking in circles. They have problem with this use-case even when using default FileStorage. Unfortunately here the issue persists, and @rabbit-aaron workaround doesn't seem to be working.
inst.data.file.file._file # smells a bit?
Looks weird doesn't it? I don't think I'm wrong when I say that most user code expects the file/storage system to be stateless, meaning it's not caching any state (mode, buffer). So when I open the file for reading, then close it, I naturally expect that all state is lost. So next time I open it, I get a fresh file, be it for reading or writing. Django doesn't do that and it's buggy. I think django-storages can do better.
For any sane person, always use it like this:
with inst.data.storage.open(inst.data.name, 'rb') as f:
... # do read stuff
with inst.data.storage.open(inst.data.name, 'wb') as f:
... # do write stuff
Was bitten by this issue today. I fixed it by just deleting the .file
after exiting the with
block. But ideally we'd not need to do this. I spent a little time trying to understand what's going on.
I'm not sure if it's relevant, but in S3Boto3StorageFile
we have:
def close(self):
# Partly omitted..
if self._file is not None:
self._file.close()
self._file = None
When _file
is closed, self.closed
property is True. Once it's set to none, self.closed
is False. When the Django core File
is then open()
ed again, we end up doing seek(0)
instead of resetting the file. Can anything be done at the django-storages layer to update the self._mode
on open
?
EDIT: I'll add that Django's docs explicitly state support for reopening with a different mode. To make the django-storages abstraction be less leaky, I'd be in favor of keeping those semantics.
ref: https://docs.djangoproject.com/en/4.1/ref/files/file/#django.core.files.File.open
I think the correct fix here would be to override S3Boto3StorageFile.open
, right? The default with the django.core.files.base.File
that hits the local filesystem doesn't make any sense in a cloud context. But calling self._storage.open
does. I'll accept a PR with this fix.
Not sure if we need the re-seeking? If not, we'd want to close the file then I suppose. So there is some nuance.
I think failing that, @skrat has the correct approach.
My attempt: https://github.com/jschneier/django-storages/pull/1227