
Set blob Content-Type in AzureBlobFileSystem.put()

Open jhamman opened this issue 3 years ago • 4 comments

Is it possible to set the Content-Type attribute (or other content settings) of blobs using adlfs?

I would like to do something like this:

fs = AzureBlobFileSystem(...)
fs.put('index.html', 'foo/bar', content_settings={'Content-Type': 'text/html'})

This does seem to be possible when using the azure storage library directly. For example:

from azure.storage.blob import ContentSettings

blob_client = ...
with open("index.html", "rb") as data:
    blob_client.upload_blob(data, content_settings=ContentSettings(content_type='text/html'))

I suspect this is possible today, but the fs.put example above does not work, so I'm looking for examples and/or documentation to support this sort of workflow.

cc @orianac

jhamman avatar Jan 29 '22 00:01 jhamman

That method takes **kwargs but silently ignores them. I think we could pass them through to BlobClient.upload_blob at https://github.com/fsspec/adlfs/blob/master/adlfs/spec.py#L1575-L1581.

IMO, it'd be best to just have users use the azure.storage.blob objects rather than trying to wrap them / infer what they mean from something like content_settings={"Content-Type": ...}. So I'd suggest just supporting

fs.put('index.html', 'foo/bar', content_settings=azure.storage.blob.ContentSettings(...))

by passing through kwargs unmodified.
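
As a rough illustration of the pass-through idea (not the adlfs code itself), the sketch below uses two hypothetical stand-in classes, FakeBlobClient and SketchFileSystem, and a hypothetical put_bytes method. It only shows keyword arguments reaching the client's upload_blob call unchanged; the real BlobClient and AzureBlobFileSystem are deliberately not imported here.

```python
from typing import Any, Dict


class FakeBlobClient:
    """Hypothetical stand-in for a blob client; it only records what it receives."""

    def __init__(self) -> None:
        self.received: Dict[str, Any] = {}

    def upload_blob(self, data: bytes, **kwargs: Any) -> None:
        # Keep a copy of the keyword arguments so the forwarding can be inspected.
        self.received = dict(kwargs)


class SketchFileSystem:
    """Hypothetical filesystem wrapper, not the adlfs implementation."""

    def __init__(self, client: FakeBlobClient) -> None:
        self.client = client

    def put_bytes(self, data: bytes, **kwargs: Any) -> None:
        # Forward kwargs unmodified instead of silently dropping them.
        self.client.upload_blob(data, **kwargs)


client = FakeBlobClient()
fs = SketchFileSystem(client)
settings = object()  # placeholder for a ContentSettings instance
fs.put_bytes(b"<html></html>", content_settings=settings)
```

With this shape the wrapper never has to interpret content_settings; whatever object the caller supplies is handed to the client as-is.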

TomAugspurger avatar Jan 30 '22 13:01 TomAugspurger

Not sure if there are any complications to doing this on Azure Blob Storage, but just as a passing note, s3fs does this automatically by guessing the MIME type (and supports overriding, if you need a specific content type): https://github.com/fsspec/s3fs/blob/736ee5ae7a16494e14fa12838cd963e1472afb9e/s3fs/core.py#L967-L970
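
A minimal sketch of the guess-with-override idea, using only Python's standard mimetypes module. The function name guess_content_type and the application/octet-stream fallback are assumptions made for illustration; this is not the s3fs code linked above.

```python
import mimetypes
from typing import Optional


def guess_content_type(path: str, override: Optional[str] = None) -> str:
    """Return a content type for path, preferring an explicit override."""
    if override is not None:
        # The caller asked for a specific content type, so skip guessing.
        return override
    guessed, _encoding = mimetypes.guess_type(path)
    # Assumed fallback for unknown extensions; a real backend may choose differently.
    return guessed or "application/octet-stream"
```

Called with 'index.html' and no override, this falls back on the extension-based guess; passing override bypasses the guess entirely.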

isidentical avatar Jan 30 '22 13:01 isidentical

https://github.com/fsspec/filesystem_spec/issues/916 is an upstream issue for standardizing this. For now, I'm going to pass through kwargs so that we at least have the option to set it using azure.storage.blob.ContentSettings.

TomAugspurger avatar Mar 08 '22 16:03 TomAugspurger

Actually, just passing through kwargs isn't going to do it. They're silently dropped at https://github.com/fsspec/filesystem_spec/blob/504483cec8a1d7f9afb77d9b6b637f40f74b5a81/fsspec/spec.py#L661-L664.

So I'm probably just going to keep working around this for now :/

TomAugspurger avatar Mar 08 '22 16:03 TomAugspurger

Fixed by #392.

TomAugspurger avatar Feb 09 '23 12:02 TomAugspurger