warp icon indicating copy to clipboard operation
warp copied to clipboard

Reimplement multipart and make it stream the file upload

Open paolobarbolini opened this issue 3 years ago • 3 comments

Hello, seeing how long #323 has been open I decided to give it a go myself. I should probably have asked in the issue or on Discord, but I really wanted to give this issue a shot, so here I am.

This rewrites the multipart filter to do all of the multipart decoding by itself, dropping the dependency on the multipart crate. This reduces the number of dependencies, and makes it so the body is streamed instead of having to read it entirely into memory (see #323).

No changes have been done to the public API. The only breaking change will be observed by implementations that try to read from Parts obtained before the last call to FormData::poll_next. In those cases an error will be returned.

The decoder is mostly zero copy. The only case in which data is copied is when the underlying body yields very small chunks, since the boundary searcher expects that two consecutive chunks have a combined length greater than the boundary length.

paolobarbolini avatar May 08 '21 20:05 paolobarbolini

So, I'm beyond excited at the work here! Thanks so much, this is super cool. I'd really like to fix warp's multipart filters to use full streaming. I have two meta thoughts about this though:

  • Would this make sense as a separate crate, so that others could utilize it outside of warp?
  • To have a full multipart parser, it'd be a good idea to have a sufficient test suite testing everything about it.

seanmonstar avatar Jun 14 '21 22:06 seanmonstar

Would this make sense as a separate crate, so that others could utilize it outside of warp?

A separate crate could be interesting here. Looking around crates.io there are a few other implementations but they seem a bit bloated to me, so a new crate could change that. I could move this implementation to a separate crate and update the PR when I get around to publishing it.

To have a full multipart parser, it'd be a good idea to have a sufficient test suite testing everything about it.

That's definitely something that's missing now and that I would add before calling this PR ready or publishing a crate. Another advantage of having this be a separate crate could be avoiding bloating warp's tests.

paolobarbolini avatar Jun 15 '21 14:06 paolobarbolini

I published the implementation as a separate crate and updated this PR to use it. Since I see you use the squash and merge GitHub feature I thought it would be fine to keep the commit history, so I just made a commit that removes the implementation from warp and wraps my crate instead.

paolobarbolini avatar Jun 27 '21 19:06 paolobarbolini

Is there an update on this, I feel like this PR gets some importance as it allow to drop multipart dependencies. As it is no longer updated and now it also gives a new warning in rust nightly

warning: the following packages contain code that will be rejected by a future version of Rust: buf_redux v0.8.4, multipart v0.18.0

The GitHub repo of multipart and buf_redux has been archived, as far as I know there is no current plan to update them.

Eragonfr avatar Feb 22 '23 14:02 Eragonfr

Following up on @Eragonfr 's comment. Is there any update on this? There is now a vulnerability in remove_dir_all that will go away if multipart is removed:

error[vulnerability]: Race Condition Enabling Link Following and Time-of-check Time-of-use (TOCTOU)
...
    = Solution: Upgrade to >=0.8.0
    = remove_dir_all v0.5.3
      └── tempfile v3.3.0
          └── multipart v0.18.0
              └── warp v0.3.3

willclevine avatar Mar 06 '23 19:03 willclevine

Hey folks, what do you think about using multer for multipart? It seems more up-to date and maintained than the other options.

jaysonsantos avatar Mar 17 '23 18:03 jaysonsantos

I tried fixing up the conflicts, and it just got annoying. I then tried to cherry-pick manually, which made it worse. So now, I have a new PR where I just copy and pasted the changes into the files, and made it a new commit. That's in #1031.

(Using multer would be fine, I tried for a few minutes, until I realized it only has async fns, which means I'd need to use stream::unfold and a BoxStream. I'd merge a new PR doing that, if someone wanted to do it.

seanmonstar avatar Mar 31 '23 16:03 seanmonstar