Register a MIME type for the Parquet format.

Open asfimport opened this issue 5 years ago • 15 comments

There is currently no MIME type registered for Parquet. Perhaps this is intentional.

If it is not intentional, I suggest steps be taken to register a MIME type with IANA.

https://www.iana.org/assignments/media-types/media-types.xhtml

Reporter: Mark Wood

Note: This issue was originally created as PARQUET-1889. Please see the migration documentation for further details.

asfimport avatar Jul 28 '20 14:07 asfimport

Thomas Champagne: Any news on a possible registration of a MIME type for the Parquet format?

I propose application/parquet :)

asfimport avatar Oct 29 '21 09:10 asfimport

Weston Pace / @westonpace: application/parquet would be cool but might be a bit of a challenge. The root namespace is technically reserved for IETF standards or recognition from a "standards related organization" (whatever that means). application/vnd.apache.parquet would probably be pretty trivial to register, though, and would be similar to application/vnd.apache.thrift.xyz and application/vnd.apache.arrow.xyz.

asfimport avatar Jan 04 '22 21:01 asfimport

Xinli Shang / @shangxinli: +1 on @westonpace's point

asfimport avatar Jan 11 '22 20:01 asfimport

Bryce Mecum / @amoeba: It looks like a request to IANA to register application/vnd.apache.parquet was submitted sometime early in 2023, as evidenced by this entry in the Parquet dev ML: https://lists.apache.org/thread/lrfsjhzoq20o95z5zn9zyrb8rdolqzz7. It looks like IANA has requested changes on the initial application so I'll keep an eye on the ML and update here when we can close this.

asfimport avatar Mar 08 '23 21:03 asfimport

Bryce Mecum / @amoeba: I've just submitted a request to IANA for application/vnd.apache.parquet. I'll update in this thread as that progresses.

asfimport avatar Feb 11 '24 01:02 asfimport

Bryce Mecum / @amoeba: The request has been received, given a ticket number of 1358674 with IANA, and sent off for a review.

asfimport avatar Feb 12 '24 22:02 asfimport

Bryce Mecum / @amoeba: The registration is done, and is now available: https://www.iana.org/assignments/media-types/application/vnd.apache.parquet.

I think it would be good if someone from the Parquet PMC could forward a note about this to the entire PMC asking for a quick review. I did my best filling in everything, but I would appreciate a review of the entire registration, and in particular of the "Security considerations" and "Interoperability considerations" portions. Any comments can be directed either here or to me at [email protected] and I will update the registration accordingly.
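
For readers who want to use the registered type from code, here is a minimal sketch using Python's standard `mimetypes` module. It assumes the conventional `.parquet` file extension; the helper name `register_parquet_media_type` and the example path are hypothetical and not part of the registration itself.

```python
import mimetypes

# Media type registered with IANA, as discussed in this thread.
PARQUET_MEDIA_TYPE = "application/vnd.apache.parquet"


def register_parquet_media_type():
    """Map the (assumed) .parquet extension to the registered media type."""
    mimetypes.add_type(PARQUET_MEDIA_TYPE, ".parquet")


register_parquet_media_type()

# The path is illustrative only; guess_type looks at the name, not the file.
guessed_type, encoding = mimetypes.guess_type("data/events.parquet")
print(guessed_type, encoding)
```

This only affects the current process's extension-to-type mapping. Serving the type over HTTP would still depend on how the web server or object store is configured.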

asfimport avatar Feb 14 '24 21:02 asfimport

Gang Wu / @wgtmac: @amoeba Thanks for your effort! Does it cover encrypted Parquet files? Or should those have a separate MIME type?

cc PMC members for more advice [[email protected]] @gszadovszky @ggershinsky

asfimport avatar Feb 18 '24 05:02 asfimport

Bryce Mecum / @amoeba: Hi @wgtmac,

When you say "encrypted parquet file", do you mean Parquet files that use Parquet's modular encryption? I think this media type registration covers that case so another media type (or profile) wouldn't be needed. I made mention of this use case for Parquet in the registration. That said, I would like to consider your question thoroughly. Are there any media types that have a similar mechanism to Parquet's we could look at?

asfimport avatar Feb 18 '24 05:02 asfimport

Gang Wu / @wgtmac: I'm not familiar with other media types. From your registration link I found the following words:

"Additional information: 2. Magic number(s): PAR1"

The reason I asked about encryption is that an encrypted file has a different magic number, PARE.

asfimport avatar Feb 18 '24 06:02 asfimport

Bryce Mecum / @amoeba: Ah, I didn't know that. You might be right then, and thanks for catching it. I'll look into it further to see whether another media type makes sense, and I welcome others' thoughts on it too.

asfimport avatar Feb 18 '24 06:02 asfimport

Gidon Gershinsky / @ggershinsky: Agreed. "Additional information: 2. Magic number(s): PAR1" should be "Additional information: 2. Magic number(s): PAR1, PARE" (encrypted Parquet files can have either magic number, depending on the encryption mode).

Otherwise, LGTM.
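
To illustrate why both magic numbers matter for content sniffing, here is a minimal sketch of a detector that accepts either one. It assumes the magic number occupies the first four bytes of the file; the helper name `sniff_parquet_media_type` is hypothetical, and the check is only a cheap hint, not a validation of the file.

```python
import io

# Magic numbers named in this thread: PAR1, plus PARE for some encrypted files.
PARQUET_MAGIC_NUMBERS = (b"PAR1", b"PARE")
PARQUET_MEDIA_TYPE = "application/vnd.apache.parquet"


def sniff_parquet_media_type(stream):
    """Return the Parquet media type if the stream starts with a known magic number.

    Hypothetical helper: only the first four bytes are inspected (an
    assumption of this sketch), so a match does not prove the file is valid.
    """
    head = stream.read(4)
    if head in PARQUET_MAGIC_NUMBERS:
        return PARQUET_MEDIA_TYPE
    return None


# In-memory stand-ins for real files; the trailing bytes are arbitrary padding.
print(sniff_parquet_media_type(io.BytesIO(b"PAR1" + b"\x00" * 8)))
print(sniff_parquet_media_type(io.BytesIO(b"PARE" + b"\x00" * 8)))
print(sniff_parquet_media_type(io.BytesIO(b"%PDF-1.7")))
```

A detector that only knew PAR1 would miss the files that carry PARE, which is the gap the suggested change to the registration addresses.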

asfimport avatar Feb 18 '24 08:02 asfimport

Gabor Szadovszky / @gszadovszky: I agree with @ggershinsky's suggestion. LGTM, otherwise.

asfimport avatar Feb 19 '24 08:02 asfimport

Bryce Mecum / @amoeba: Thanks all. I've submitted a change request to IANA to add the extra magic number. I'll update here when that change is active.

asfimport avatar Feb 27 '24 19:02 asfimport

Bryce Mecum / @amoeba: The registration has been updated to include PARE as a possible magic number, as discussed here. See https://www.iana.org/assignments/media-types/application/vnd.apache.parquet.

I think this can be closed. Thanks everyone for the help.

asfimport avatar Mar 05 '24 19:03 asfimport