filesystem_spec

Standardize content_type handling when writing files

Open TomAugspurger opened this issue 3 years ago • 5 comments

Would it be helpful to standardize the ability to set the content type of files written by fsspec? Several backends either support this already or have an open request for it (https://github.com/fsspec/s3fs/blob/4f289eaa34dfe8337a72f5a0148c41a44793fde0/s3fs/core.py#L971-L974, https://github.com/fsspec/adlfs/issues/294). Each backend will typically have a different "native" way of setting it. With S3 it's a ContentType keyword. With Azure, it's content_settings=azure.storage.blob.ContentSettings(content_type=...).

This proposal would make content_type a proper keyword of pipe, put, and any other methods that write data. Each backend would be responsible for translating it into its own native setting.
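As a rough sketch of what that translation might look like, assuming a hypothetical helper named translate_content_type (not part of fsspec, s3fs, or adlfs) and a stand-in class in place of azure.storage.blob.ContentSettings so the example runs without the Azure SDK:

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class ContentSettingsStandIn:
    """Stand-in for azure.storage.blob.ContentSettings (assumption, for illustration only)."""

    content_type: Optional[str] = None


def translate_content_type(protocol: str, content_type: Optional[str]) -> Dict[str, Any]:
    """Hypothetical helper: map a generic content_type to backend-native keywords."""
    if content_type is None:
        # Nothing requested, so nothing extra is passed to the backend.
        return {}
    if protocol == "s3":
        # S3 expects a ContentType keyword.
        return {"ContentType": content_type}
    if protocol in ("abfs", "az"):
        # Azure expects a content_settings object carrying the content type.
        return {"content_settings": ContentSettingsStandIn(content_type=content_type)}
    # Backends with no notion of a content type ignore the keyword in this sketch.
    return {}


print(translate_content_type("s3", "application/json"))
print(translate_content_type("abfs", "application/json"))
print(translate_content_type("file", "application/json"))
```

Whether a backend without the concept should silently ignore the keyword, as here, or raise an error is a design choice this sketch does not settle.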

We might also need to standardize a default behavior. s3fs uses a library to guess the content type. I'm not sure if that's appropriate (if it were, I'd think that boto / azure-storage-blob would do it?).
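For illustration, extension-based guessing could look like the following minimal sketch. It uses Python's standard mimetypes module; the guess_content_type helper and its default parameter are assumptions made up for this example, not an existing fsspec API.

```python
import mimetypes
from typing import Optional


def guess_content_type(path: str, default: Optional[str] = None) -> Optional[str]:
    """Hypothetical default: guess from the file extension, else return `default`."""
    guessed, _encoding = mimetypes.guess_type(path)
    return guessed if guessed is not None else default


# A well-known extension yields a guess; a path without an extension does not.
print(guess_content_type("bucket/data/table.json"))
print(guess_content_type("bucket/data/blob"))
print(guess_content_type("bucket/data/blob", default="application/octet-stream"))
```

The guess depends only on the path, never on the bytes being written, which is one reason a default like this may not be appropriate.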

TomAugspurger, Mar 08 '22 16:03

You are right that this MIME-like thing applies to a few backends, but by no means all. I would be happy for the backends that do have this concept to support a standardised kwarg which can be translated into whatever specific thing they each require, but of course the direct route to doing the same thing needs to remain available.
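One way to keep the direct route available is to let explicitly passed native keywords take precedence over the standardised one. The sketch below assumes that precedence rule and a hypothetical merge_s3_kwargs helper; neither is something the thread has agreed on.

```python
from typing import Any, Dict, Optional


def merge_s3_kwargs(content_type: Optional[str] = None, **native: Any) -> Dict[str, Any]:
    """Hypothetical merge: native S3 keywords override the generic content_type."""
    merged: Dict[str, Any] = {}
    if content_type is not None:
        merged["ContentType"] = content_type
    # Assumption: on conflict, the backend-native keyword wins.
    merged.update(native)
    return merged


print(merge_s3_kwargs("text/plain"))
print(merge_s3_kwargs("text/plain", ContentType="text/csv"))
```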

About guessing the content type... it's a little controversial, to be sure.

martindurant, Mar 08 '22 17:03