Register a MIME type for the Parquet format.
There is currently no MIME type registered for Parquet. Perhaps this is intentional.
If it is not intentional, I suggest steps be taken to register a MIME type with IANA.
https://www.iana.org/assignments/media-types/media-types.xhtml
Reporter: Mark Wood
Note: This issue was originally created as PARQUET-1889. Please see the migration documentation for further details.
Thomas Champagne: Any news on a possible registration of a MIME type for the Parquet format?
I propose :)
application/parquet
Weston Pace / @westonpace:
application/parquet would be cool but might be a bit of a challenge. The root namespace is technically reserved for IETF standards or recognition from a "standards related organization" (whatever that means). application/vnd.apache.parquet would probably be pretty trivial to register though and would be similar to application/vnd.apache.thrift.xyz and application/vnd.apache.arrow.xyz
Xinli Shang / @shangxinli: +1 on @westonpace's point
Bryce Mecum / @amoeba: It looks like a request to IANA to register application/vnd.apache.parquet was submitted sometime early in 2023, as evidenced by this entry in the Parquet dev ML: https://lists.apache.org/thread/lrfsjhzoq20o95z5zn9zyrb8rdolqzz7. It looks like IANA has requested changes on the initial application so I'll keep an eye on the ML and update here when we can close this.
Bryce Mecum / @amoeba: I've just submitted a request to IANA for application/vnd.apache.parquet. I'll update in this thread as that progresses.
Bryce Mecum / @amoeba: The request has been received, given a ticket number of 1358674 with IANA, and sent off for a review.
Bryce Mecum / @amoeba: The registration is done, and is now available: https://www.iana.org/assignments/media-types/application/vnd.apache.parquet.
I think it would be good if someone from the Parquet PMC could forward a note about this to the entire PMC asking for a quick review. I did my best filling in everything but would appreciate a review of the entire registration, especially the "Security considerations" and "Interoperability considerations" portions. Any comments can be directed either here or to me at [email protected] and I will update the registration accordingly.
Gang Wu / @wgtmac: @amoeba Thanks for your effort! Does it support encrypted Parquet files? Or should that be a separate MIME type?
cc PMCs for more advice [[email protected]] @gszadovszky @ggershinsky
Bryce Mecum / @amoeba: Hi @wgtmac,
When you say "encrypted parquet file", do you mean Parquet files that use Parquet's modular encryption? I think this media type registration covers that case so another media type (or profile) wouldn't be needed. I made mention of this use case for Parquet in the registration. That said, I would like to consider your question thoroughly. Are there any media types that have a similar mechanism to Parquet's we could look at?
Gang Wu / @wgtmac: I'm not familiar with other media types. From your registration link I found the following words:
Additional information: 2. Magic number(s): PAR1
The reason I asked about encryption is that an encrypted file has a different magic number, PARE.
Bryce Mecum / @amoeba: ah, I didn't know that. You might be right then, and thanks for catching it. I'll look into it further to see whether another media type makes sense and I welcome any others' thoughts on it too.
Gidon Gershinsky / @ggershinsky:
Agreed,
Additional information: 2. Magic number(s): PAR1
should be
Additional information: 2. Magic number(s): PAR1, PARE
(encrypted parquet files can have either magic number, depending on the encryption mode).
Otherwise, LGTM.
Gabor Szadovszky / @gszadovszky: I agree with @ggershinsky's suggestion. LGTM, otherwise.
Bryce Mecum / @amoeba: Thanks all. I've submitted a change request to IANA to add the extra magic number. I'll update here when that change is active.
Bryce Mecum / @amoeba: The registration has been updated to include PARE as a possible magic number, as discussed here. See https://www.iana.org/assignments/media-types/application/vnd.apache.parquet.
I think this can be closed. Thanks everyone for the help.
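For anyone landing here who wants to use the registered type, here is a minimal sketch of how a consumer might map the two magic numbers discussed above (PAR1 and PARE) to application/vnd.apache.parquet. The function name sniff_parquet and the constant names are hypothetical and not part of any Parquet library. The sketch only inspects the first four bytes of a stream and does not validate the rest of the file.

```python
import io

# Media type as registered with IANA (see the link above).
PARQUET_MEDIA_TYPE = "application/vnd.apache.parquet"

# Magic numbers listed in the updated registration.
PARQUET_MAGICS = (b"PAR1", b"PARE")


def sniff_parquet(stream):
    """Hypothetical helper: return the Parquet media type if the stream
    starts with a known magic number, otherwise None."""
    head = stream.read(4)
    if head in PARQUET_MAGICS:
        return PARQUET_MEDIA_TYPE
    return None


if __name__ == "__main__":
    # In-memory stand-ins for file contents; these are not valid Parquet files.
    print(sniff_parquet(io.BytesIO(b"PAR1" + b"\x00" * 8)))
    print(sniff_parquet(io.BytesIO(b"PARE" + b"\x00" * 8)))
    print(sniff_parquet(io.BytesIO(b"%PDF-1.7")))
```

A real check would likely also validate the footer, so treat this only as an illustration of why both magic numbers belong in the registration.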