adlfs
Set blob Content-Type in AzureBlobFileSystem.put()
Is it possible to set the Content-Type attribute (or other content settings) of blobs using adlfs?
I would like to do something like this:
fs = AzureBlobFileSystem(...)
fs.put('index.html', 'foo/bar', content_settings={'Content-Type': 'text/html'})
This does seem to be possible when using the azure storage library directly. For example:
from azure.storage.blob import ContentSettings
blob_client = ...
with open('index.html', "rb") as data:
    blob_client.upload_blob(data, content_settings=ContentSettings(content_type='text/html'))
I suspect this is possible today but the above example does not work so I'm looking for examples and/or documentation to support this sort of workflow.
cc @orianac
That method takes **kwargs but silently ignores them. I think we could pass them through to BlobClient.upload_blob at https://github.com/fsspec/adlfs/blob/master/adlfs/spec.py#L1575-L1581.
IMO, it'd be best to just have users use the azure.storage.blob objects rather than trying to wrap them / infer what they mean from something like content_settings={"Content-Type": ...}. So I'd suggest just supporting
fs.put('index.html', 'foo/bar', content_settings=azure.storage.blob.ContentSettings(...))
by passing through kwargs unmodified.
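To make the difference concrete, here is a minimal, self-contained sketch of "accept **kwargs but drop them" versus "pass them through unmodified". FakeBlobClient and both put_* helpers are hypothetical stand-ins written for illustration, not adlfs or Azure SDK code, and the dict is only a placeholder for a real azure.storage.blob.ContentSettings object.

```python
class FakeBlobClient:
    """Hypothetical stand-in for a blob client; records what upload_blob receives."""

    def __init__(self):
        self.received = None

    def upload_blob(self, data, **kwargs):
        # Remember the keyword arguments so they can be inspected afterwards.
        self.received = kwargs


def put_dropping_kwargs(client, data, **kwargs):
    # Accepts **kwargs but never forwards them, so content settings are lost.
    client.upload_blob(data)


def put_forwarding_kwargs(client, data, **kwargs):
    # Forwards **kwargs unmodified, so content_settings reaches upload_blob.
    client.upload_blob(data, **kwargs)


dropping, forwarding = FakeBlobClient(), FakeBlobClient()
settings = {"content_type": "text/html"}  # placeholder for a ContentSettings object

put_dropping_kwargs(dropping, b"<html></html>", content_settings=settings)
put_forwarding_kwargs(forwarding, b"<html></html>", content_settings=settings)
```

After running this, dropping.received is an empty dict while forwarding.received still carries content_settings, which is the behaviour the pass-through suggestion is after.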
Not sure if there are any complications in doing this on Azure Blob Storage, but just as a passing note, s3fs does this automatically by guessing the MIME type (and supports overriding it, if you need a specific content type): https://github.com/fsspec/s3fs/blob/736ee5ae7a16494e14fa12838cd963e1472afb9e/s3fs/core.py#L967-L970
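For reference, guessing a content type from the file name only needs the standard library. The sketch below is an illustration of that idea, not the s3fs code; the helper name guess_content_type, its override parameter, and the application/octet-stream fallback are assumptions made here.

```python
import mimetypes


def guess_content_type(path, override=None, default="application/octet-stream"):
    """Return a content type for path, preferring an explicit override."""
    if override is not None:
        # The caller asked for a specific content type; do not guess.
        return override
    # guess_type returns a (type, encoding) tuple; type is None if unknown.
    guessed, _encoding = mimetypes.guess_type(path)
    return guessed if guessed is not None else default


html_type = guess_content_type("index.html")
forced_type = guess_content_type("index.html", override="text/plain")
```

The result could then be used to build the content settings passed to the upload call, for example ContentSettings(content_type=html_type) when using azure.storage.blob directly.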
https://github.com/fsspec/filesystem_spec/issues/916 is an upstream issue for standardizing this. For now, I'm going to pass through kwargs so that we at least have the option to set it using azure.storage.blob.ContentSettings.
Actually, just passing through kwargs isn't going to do it. They're silently dropped at https://github.com/fsspec/filesystem_spec/blob/504483cec8a1d7f9afb77d9b6b637f40f74b5a81/fsspec/spec.py#L661-L664.
So I'm probably just going to keep working around this for now :/
Fixed by #392.