Proposal for MIME type encoding

Open alkasm opened this issue 1 year ago • 20 comments

Public-Facing Changes

This PR proposes adding a new well-known encoding for MIME types to the spec.

Description

Channels that pass data without a schema may contain content with a known MIME type. This PR proposes mime as a supported channel message encoding, with the schema encoding name referencing the MIME type/subtype directly.

There may be other ways to achieve this that are preferred, but this seemed like a good way to start a conversation about it.

alkasm avatar Aug 04 '22 05:08 alkasm

I think the “binary data” reference is confusing this a bit. Whether data is binary or utf8 or ascii doesn’t change the discussion.

That nit aside, the way I’ve been thinking about self-describing schemaless data such as h264 video is there would be no schema at all, and the channel encoding would be the IANA-registered MIME type.

jhurliman avatar Aug 04 '22 05:08 jhurliman

I guess in my mind the message encoding is the mime type, rather than the message encoding being mime. The message encoding should tell you what format the binary data in that message is.

I think we originally tried using actual mime types for the existing standardized message encodings; I can't remember why that idea got dropped (probably because there is no mime type for "ros 1 message"). But it seems reasonable to add some other well known ones for h264, jpg, etc.

As @jhurliman mentioned, there is no need for a schema for those formats (h264, jpg) because they are already self describing.

amacneil avatar Aug 04 '22 05:08 amacneil

Both of your comments helped me understand the difference between the channel message encoding and schema encoding better; thanks for that! It makes sense that the channel message encoding could be sufficient here. I guess I tried to use the schema to help disambiguate between the message encodings which aren't mime types and those that are, but that might not be necessary.

alkasm avatar Aug 04 '22 05:08 alkasm

Yeah - it's probably not clear from the current set of recommended channel/schema encodings, but schemas are intended to be optional. For self-describing messages like json or jpeg there is no need for a schema.

The reason that channel and schema encodings are specified separately is that they don't always match 1:1. For example:

  • json messages might use either https://json-schema.org/ or https://typeschema.org/ to define their schema.
  • cdr messages (ROS 2) might use either ros2msg or ros2idl to define their schema

Those combinations aren't supported in Foxglove Studio today, but we wanted the flexibility.

amacneil avatar Aug 04 '22 05:08 amacneil

Some thoughts:

The purpose of the message_encoding field is to tell you what binary serialization format the message data is in, specifically with the goal of "how to deserialize it". Sometimes message_encoding is still not enough (for example protobuf), and you need an additional schema. In mcap files, the pairing of schema+message_encoding should be sufficient to deserialize the message data now and forever.

We could consider mime as one of those situations if we use mime for message_encoding - but then what type of Schema record do you pair it with? We could say that the schema_encoding would be mime or text/plain and the schema data is image/jpeg. Though what would the schema name be? That's one approach we could take.

The other is to leverage the schema-less feature of channel records and use a schema id of 0. Then the message_encoding would need to be something like image/jpeg or mime:image/jpeg whatever we decide.
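
A minimal sketch of that second option, assuming a simplified stand-in for a channel record. The `Channel` class, its field names, and the topic are hypothetical and are not taken from any MCAP library; only the "schema id of 0" convention and the `image/jpeg` value come from the discussion above.

```python
from dataclasses import dataclass


@dataclass
class Channel:
    """Illustrative stand-in for a channel record (not a real library type)."""

    id: int
    topic: str
    message_encoding: str
    schema_id: int = 0  # 0 is used here to mean "no schema"


def is_schemaless(channel: Channel) -> bool:
    # A schema id of 0 marks a channel whose messages are self-describing.
    return channel.schema_id == 0


# Hypothetical topic name; the media type itself is the message encoding.
camera = Channel(id=1, topic="/camera/front", message_encoding="image/jpeg")
print(is_schemaless(camera), camera.message_encoding)
```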

defunctzombie avatar Aug 04 '22 17:08 defunctzombie

@defunctzombie yeah, your second paragraph was my reasoning for the PR as it was initially proposed. Since "mime" isn't an encoding I agree with the above consensus that a schema isn't necessary; the mime type itself should be sufficient as the message encoding. I do like the mime: prefix. I guess "media types" is the currently preferred nomenclature though: https://www.iana.org/assignments/media-types/media-types.xhtml

Media Types (formerly known as MIME types)

so perhaps media: as a prefix? All mime types are of the form type/subtype; there's always a slash. Potentially that is enough?

alkasm avatar Aug 07 '22 04:08 alkasm

I'm in favor of just dropping the mime: / media: prefix, and specifying in the spec that implementations should interpret any unknown message encoding as a media type.

I think the only reason we didn't just use media types to specify message encoding is that there are none registered for the initial encodings we wanted to support (ros 1, ros 2 cdr, protobuf, flatbuffer) - only json is registered. But using them going forward seems sensible.

amacneil avatar Aug 07 '22 21:08 amacneil

"implementations should interpret any unknown message encoding as a media type"

This seems like a strange fallback. What if next year we add support for a ros3 message encoding? Old tools will treat that as "unknown media type"?

jtbandes avatar Aug 08 '22 16:08 jtbandes

They would treat it like an unknown type, same as they do today.

If you want to be more specific, we could say that any message encoding containing a / is assumed to be a media type, and others should come from our shorthand list.

But also, if we add ros3 next year maybe we should just use media type syntax going forward and use something like application/x-ros3-msg?

amacneil avatar Aug 08 '22 16:08 amacneil

"They would treat it like an unknown type, same as they do today."

I guess in a world where we use the media type syntax for all types, that would make sense.

Would we use media types for both message encoding and schema encoding (when a schema is used)?

"maybe we should just use media type syntax going forward and use something like application/x-ros3-msg?"

I thought x- wasn't a thing anymore 😅 https://www.rfc-editor.org/rfc/rfc6648.html

jtbandes avatar Aug 08 '22 17:08 jtbandes

"I thought x- wasn't a thing anymore 😅 https://www.rfc-editor.org/rfc/rfc6648.html"

It seems like their answer is x- is no longer necessary because we made the registration process easier. So if we are to copy that model, we should similarly recommend against using non-standard encodings, and instead encourage users to register any custom type they are using in our appendix (possibly with a vnd. prefix if it is company-specific).

"Would we use media types for both message encoding and schema encoding (when a schema is used)?"

I mean in theory what we are trying to do is already solved by media types. In practice, there is no registered media type for "protobuf filedescriptorset" or "jsonschema" or "concatenated ros1 msg files".

So it seems like we either need to go and register a bunch of media types, or we need to have some "override" shorthand values that are not registered media types.

Turning this into a concrete proposal, we could say something like:

The message_encoding and schema_encoding must be interpreted as either (a) if it does not contain a forward slash, a well-known encoding registered in our spec appendix, or (b) if it contains a forward slash, a well-known media type. Implementers are free to put non-standard data in the message or schema encoding fields, but are strongly encouraged to register their string in one of these two databases.
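
As a rough illustration of that rule, the sketch below classifies an encoding string by whether it contains a forward slash. The function name, the return labels, and the contents of `WELL_KNOWN_ENCODINGS` are assumptions for illustration (the set lists only the encodings named in this thread, not the full spec appendix), and the check does not verify that a media type is actually registered with IANA.

```python
# Assumed shorthand list: only the encodings mentioned in this thread.
WELL_KNOWN_ENCODINGS = {"ros1", "cdr", "protobuf", "flatbuffer", "json"}


def classify_encoding(value: str) -> str:
    """Classify an encoding string according to the proposed rule."""
    if "/" in value:
        # (b) anything containing a forward slash is read as a media type
        return "media-type"
    if value in WELL_KNOWN_ENCODINGS:
        # (a) no slash: must come from the shorthand list
        return "well-known"
    # Allowed, but the author is encouraged to register the string.
    return "non-standard"


for encoding in ("image/jpeg", "protobuf", "ros3"):
    print(encoding, "->", classify_encoding(encoding))
```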

Thoughts?

amacneil avatar Aug 08 '22 23:08 amacneil

Some prior art from gRPC: a typical call uses application/grpc+proto (not registered with IANA).

from https://github.com/grpc/grpc/blob/master/doc/PROTOCOL-HTTP2.md#requests:

Content-Type → "content-type" "application/grpc" [("+proto" / "+json" / {custom})]
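
To make the suffix convention concrete, here is a small sketch that splits such a content type into its base and optional suffix. The helper name is hypothetical, and it handles only the `application/grpc` plus `+suffix` shape quoted above, not the full grammar in the gRPC document.

```python
from typing import Optional, Tuple

GRPC_BASE = "application/grpc"


def split_grpc_content_type(value: str) -> Optional[Tuple[str, Optional[str]]]:
    """Return (base, suffix) for a gRPC content type, or None if it is not one."""
    value = value.strip().lower()
    if value == GRPC_BASE:
        return GRPC_BASE, None
    prefix = GRPC_BASE + "+"
    if value.startswith(prefix) and len(value) > len(prefix):
        # Everything after the "+" is the message format, e.g. "proto".
        return GRPC_BASE, value[len(prefix):]
    return None


print(split_grpc_content_type("application/grpc+proto"))
```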

alkasm avatar Aug 10 '22 06:08 alkasm

Closing now that #563 has landed

jhurliman avatar Sep 29 '22 16:09 jhurliman

@jhurliman is #563 separate from the ask here? #563 is a rename of our existing language. This PR wanted to explore expanding the spec to allow media_type as the message encoding for channels.

defunctzombie avatar Sep 29 '22 16:09 defunctzombie

@alkasm as your use of mcap has evolved do you still think this issue is worth exploring?

defunctzombie avatar Sep 29 '22 16:09 defunctzombie

@defunctzombie you're right that the attachment field name is not the same as this issue.

For now, we have standardized on accepting a mimetype as the message encoding, with no schema encoding, i.e. language similar to:

  • Channel message_encoding: MUST be one of protobuf, json, or a MIME type
  • Schema encoding: MUST be one of protobuf or "" (empty string)
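
A minimal sketch of how a check like this might look, assuming the convention described above (a mimetype as the message encoding, with no schema encoding). The function name and error strings are hypothetical, and the regular expression only checks the type/subtype shape using the RFC 6838 restricted-name characters; it does not check IANA registration.

```python
import re

# type "/" subtype, each part built from RFC 6838 restricted-name characters.
_NAME = r"[A-Za-z0-9][A-Za-z0-9!#$&^_.+-]*"
_MEDIA_TYPE = re.compile(rf"^{_NAME}/{_NAME}$")


def validate_channel(message_encoding: str, schema_encoding: str) -> list:
    """Return a list of problems; an empty list means the pair is accepted."""
    errors = []
    is_media_type = bool(_MEDIA_TYPE.match(message_encoding))
    if message_encoding not in ("protobuf", "json") and not is_media_type:
        errors.append(f"unsupported message_encoding: {message_encoding!r}")
    if schema_encoding not in ("protobuf", ""):
        errors.append(f"unsupported schema encoding: {schema_encoding!r}")
    elif is_media_type and schema_encoding != "":
        # Media-type channels are self-describing, so no schema is expected.
        errors.append("media-type channels must not declare a schema encoding")
    return errors


print(validate_channel("image/jpeg", ""))
print(validate_channel("ros1", ""))
```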

We're primarily using the mimetype for imagery or video data, e.g. video/h264 or image/jpeg, and also some raw data streams come through as application/octet-stream.

alkasm avatar Oct 04 '22 05:10 alkasm

Would it be worth adding to our spec appendix that well known media types are explicitly allowed in the message encoding field? E.g. image/jpeg seems like a no-brainer to me.

For video/h264, I also think it would be worth explicitly stating in the appendix how we expect it to be stored with respect to timestamps.

amacneil avatar Oct 05 '22 06:10 amacneil

"video/h264 I also think would be worth explicitly stating in the appendix how we expect it to be stored with respect to timestamps."

Not just timestamps - but also which format (annexb or avcc) and how many NAL packets. I can't say with certainty since I've not done enough research on it but my quick read of the video/h264 media type does not lead me to think it is sufficient as the message_encoding value.

In my experiments making a web viewer for h264 data in an mcap file I used the following message encodings which I would assume we'd define in the mcap well-known spec. We could do the same for video/h264 but that might not align 100% with media type video/h264.

image

That aside - I do think there is value in being clear in the spec about media type use within message_encoding. It seems like a nice way to allow for storing images and other well-known formats as messages.

defunctzombie avatar Oct 05 '22 17:10 defunctzombie

We would need to be careful with the wording: "explicitly allowed" is not quite right because we don't disallow strings that are not IANA registered media types. Maybe provide an example of using image/jpeg with schema_id=0 to convey that this is a good practice.

For video, I think we need to keep researching and provide a working proof of concept before adding to the spec. We need to answer whether video/h264 is sufficient, or if it should be video/h264; codecs="avc1.4d002a", or something even more specific (e.g. messages contain NAL Access Units in Annex-B format where Decode Order equals Output Order, i.e. no B-frames).
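
For reference, a reader could split a parameterized value like the one above into its media type and codecs parameter with Python's standard `email.message` module. This is only a sketch of the parsing step: the helper name is hypothetical, and nothing here settles whether the spec should carry such parameters in message_encoding.

```python
from email.message import Message


def parse_media_type(value: str):
    """Split a media type string into (type/subtype, codecs parameter or None)."""
    msg = Message()
    msg["Content-Type"] = value
    # get_content_type() lowercases the type; get_param() strips the quotes.
    return msg.get_content_type(), msg.get_param("codecs")


print(parse_media_type('video/h264; codecs="avc1.4d002a"'))
print(parse_media_type("video/h264"))
```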

jhurliman avatar Oct 05 '22 22:10 jhurliman