Add support for multipart/mixed content type

Open npryce opened this issue 7 years ago • 11 comments

npryce avatar Apr 18 '18 09:04 npryce

Currently the module rejects any multipart content that is not multipart/form-data or multipart/related. We need to write a service that receives data from a client that posts it with the `multipart/mixed` content type. Multiparty is the only JS library we could find for parsing multipart data that exposes part headers, which are required for resolving CID URL links between parts.

Why multipart/mixed? Multipart/related is intended for content in which the parts are organised into a hierarchy, with parent/child relationships represented by CID URLs and Content-ID headers (the content type can have an optional start parameter that identifies the root of the hierarchy). Multipart/mixed is for content that contains multiple parts that are not organised into a hierarchy, but may be interlinked.
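As an illustration of the linking described above, here is a minimal sketch of resolving a `cid:` URL to the part whose `Content-ID` header matches. The helper names and the `{ headers, body }` part shape are hypothetical and are not multiparty's API; the sketch only assumes header names have been lower-cased.

```javascript
// Hypothetical helpers: each part is a plain object { headers, body }
// whose header names are lower-cased. This is not multiparty's API.
function indexByContentId(parts) {
  const byId = new Map();
  for (const part of parts) {
    const raw = part.headers['content-id'];
    if (!raw) continue; // parts without a Content-ID cannot be link targets
    // Content-ID values are wrapped in angle brackets: <id@host>
    byId.set(raw.trim().replace(/^<|>$/g, ''), part);
  }
  return byId;
}

function resolveCid(url, byId) {
  if (!/^cid:/i.test(url)) return undefined; // not a cid: URL
  // A cid: URL carries the Content-ID without brackets, percent-encoded
  const id = decodeURIComponent(url.slice(4));
  return byId.get(id);
}
```

Nothing in the lookup depends on whether the enclosing type is multipart/related or multipart/mixed; it only needs the part headers.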

All multipart content types have the same basic format, but different content types indicate different semantics of how a receiver is meant to interpret the parts. So no changes to the parser code are required to support multipart/mixed.
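To make the "same basic format" point concrete, here is a deliberately simplified sketch of splitting a multipart body on its boundary. It is not multiparty's parser: it assumes the whole body is an in-memory string with CRLF line endings, and both function names are made up for illustration. The subtype is never inspected, only the `boundary` parameter.

```javascript
// Sketch only, not multiparty's streaming parser.
function boundaryFromContentType(contentType) {
  // Accepts boundary=value or boundary="value"; the subtype is ignored.
  const match = /;\s*boundary=(?:"([^"]+)"|([^;\s]+))/i.exec(contentType);
  return match ? (match[1] || match[2]) : null;
}

function splitMultipart(body, boundary) {
  const parts = [];
  // Naive split: a real parser only treats the delimiter as such at line start.
  const segments = body.split('--' + boundary);
  // segments[0] is the preamble; the close delimiter leaves a segment starting with "--".
  for (const segment of segments.slice(1)) {
    if (segment.startsWith('--')) break;
    const headerEnd = segment.indexOf('\r\n\r\n');
    if (headerEnd === -1) continue; // malformed part, skipped in this sketch
    const headers = {};
    // Skip the CRLF that follows the delimiter, then read "Name: value" lines.
    for (const line of segment.slice(2, headerEnd).split('\r\n')) {
      const colon = line.indexOf(':');
      if (colon === -1) continue;
      headers[line.slice(0, colon).trim().toLowerCase()] = line.slice(colon + 1).trim();
    }
    // The CRLF before the next delimiter belongs to the delimiter, not the part body.
    parts.push({ headers, body: segment.slice(headerEnd + 4).replace(/\r\n$/, '') });
  }
  return parts;
}
```

Calling `splitMultipart(body, boundaryFromContentType(header))` with the same body under a `multipart/form-data`, `multipart/related` or `multipart/mixed` header yields the same parts in this sketch; only the interpretation of those parts differs.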

I can add a test for mixed bodies, but the content type doesn't need different parsing logic, so I picked a single simple example to run through with the multipart/mixed content type to avoid duplication. I looked into making all the tests for success cases run for all supported content types, but considered it to be too much work (given my time constraints) and too large a change to send in one PR.

npryce avatar Apr 18 '18 12:04 npryce

I'm not sure it is true that they are all the same. https://github.com/pillarjs/multiparty/issues/175 seems to say otherwise, and we were going to drop the related one, since it doesn't actually work with this module unless someone is willing to make the parser changes needed to support it. Can you link me to the specification for mixed?

dougwilson avatar Apr 18 '18 12:04 dougwilson

Multipart/mixed is defined here: https://tools.ietf.org/html/rfc2046#section-5.1.3

npryce avatar Apr 18 '18 12:04 npryce

Awesome, thanks! I'll read over this this weekend and get back to you 👍

dougwilson avatar Apr 18 '18 12:04 dougwilson

Thanks.

npryce avatar Apr 18 '18 12:04 npryce

Another quick question for you: how can I replicate making the requests from the browser upload? I'm curious for two reasons: (1) to make sure I understand how to reproduce this for future support requests, and (2) because the way this module currently parses filenames is actually completely off the spec; web browsers didn't follow it correctly, so we compensated through trial and error. Does the same broken parsing apply to mixed as well, or should we have mixed follow the spec?

dougwilson avatar Apr 18 '18 13:04 dougwilson

To test our use of Multiparty, I generated a multipart/mixed request in Java using the MultipartRequestEntity class of Apache HttpClient. Multiparty parsed it perfectly, the filenames were parsed correctly, and content ids were accessible via the part headers, which was the key feature we needed.

npryce avatar Apr 18 '18 16:04 npryce

Can you share the Java code and how I can use it? What happens if your filename has "%22" in the name with the mixed type? Does it parse correctly? I would assume it should parse through as %22. For example, parse your mixed response with some software that currently parses mixed and see if it sees the filename the same way this module does.
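For reference, a spec-style reading of the filename parameter could look like the sketch below. It is an assumption-laden illustration, not this module's filename handling (which, as noted earlier in the thread, deliberately deviates from the spec): the hypothetical `filenameFromDisposition` treats the value as a quoted-string, undoes backslash escapes, and performs no percent-decoding, so `%22` passes through untouched.

```javascript
// Hypothetical helper, not multiparty's filename parsing.
// Reads filename="..." as a quoted-string and leaves percent sequences alone.
function filenameFromDisposition(header) {
  // Inside the quotes: either a backslash-escaped character or any
  // character other than a double quote or backslash.
  const match = /;\s*filename="((?:\\.|[^"\\])*)"/i.exec(header);
  if (!match) return null;
  // Undo quoted-pair escapes such as \" and \\ ; no percent-decoding is done.
  return match[1].replace(/\\(.)/g, '$1');
}

console.log(filenameFromDisposition('attachment; filename="a%22b.txt"'));
```

Comparing this output against what the module reports for the same header is one way to see where the two interpretations diverge.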

dougwilson avatar Apr 18 '18 16:04 dougwilson

I'll check and let you know.

npryce avatar Apr 18 '18 16:04 npryce

Awesome. Maybe you can teach me how to fish, so to speak? I can help, and I'll need to know in the future anyway to maintain this and field issues on it as well 👍 I would try all possible characters in the filename.

dougwilson avatar Apr 18 '18 17:04 dougwilson

Hi. I am struggling with receiving a multipart/related request that contains a binary zip file. No matter how I extract the file from the incoming request, the file is corrupted. I was hoping that this PR might do it, but I can see that it's not really implemented. If anyone has a general idea of how to implement parsing of multipart/related, I could have a crack at writing it.

Any ideas, or pointers in the right direction?
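One general cause of this kind of corruption, which may or may not be what is happening here, is decoding binary part data as text somewhere along the way. The sketch below uses made-up bytes to show the difference between concatenating chunks as strings and as Buffers in Node.js.

```javascript
// Made-up bytes: a zip-like signature followed by bytes that are not valid UTF-8.
const original = Buffer.from([0x50, 0x4b, 0x03, 0x04, 0xff, 0xfe, 0x80]);
// Pretend the data arrived in two stream chunks.
const chunks = [original.subarray(0, 3), original.subarray(3)];

// Lossy: decoding each chunk as UTF-8 replaces invalid bytes with U+FFFD.
let asText = '';
for (const chunk of chunks) asText += chunk.toString('utf8');
const viaString = Buffer.from(asText, 'utf8');

// Lossless: keep the chunks as Buffers and concatenate the raw bytes.
const viaBuffers = Buffer.concat(chunks);

console.log('string round trip intact:', viaString.equals(original));
console.log('buffer concat intact:', viaBuffers.equals(original));
```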

adamhaeger avatar Jun 06 '18 08:06 adamhaeger