warp
warp copied to clipboard
Reimplement multipart and make it stream the file upload
Hello, seeing how long #323 has been open I decided to give it a go myself. I should probably have asked in the issue or on Discord, but I really wanted to give this issue a shot, so here I am.
This rewrites the multipart filter to do all of the multipart decoding by itself, dropping the dependency on the multipart
crate. This reduces the number of dependencies, and makes it so the body is streamed instead of having to read it entirely into memory (see #323).
No changes have been done to the public API. The only breaking change will be observed by implementations that try to read from Part
s obtained before the last call to FormData::poll_next
. In those cases an error will be returned.
The decoder is mostly zero copy. The only case in which data is copied is when the underlying body yields very small chunks, since the boundary searcher expects that two consecutive chunks have a combined length greater than the boundary length.
So, I'm beyond excited at the work here! Thanks so much, this is super cool. I'd really like to fix warp's multipart filters to use full streaming. I have two meta thoughts about this though:
- Would this make sense as a separate crate, so that others could utilize it outside of warp?
- To have a full multipart parser, it'd be a good idea to have a sufficient test suite testing everything about it.
Would this make sense as a separate crate, so that others could utilize it outside of warp?
A separate crate could be interesting here. Looking around crates.io there are a few other implementations but they seem a bit bloated to me, so a new crate could change that. I could move this implementation to a separate crate and update the PR when I get around to publishing it.
To have a full multipart parser, it'd be a good idea to have a sufficient test suite testing everything about it.
That's definitely something that's missing now and that I would add before calling this PR ready or publishing a crate. Another advantage of having this be a separate crate could be avoiding bloating warp's tests.
I published the implementation as a separate crate and updated this PR to use it. Since I see you use the squash and merge GitHub feature I thought it would be fine to keep the commit history, so I just made a commit that removes the implementation from warp and wraps my crate instead.
Is there an update on this, I feel like this PR gets some importance as it allow to drop multipart
dependencies. As it is no longer updated and now it also gives a new warning in rust nightly
warning: the following packages contain code that will be rejected by a future version of Rust: buf_redux v0.8.4, multipart v0.18.0
The GitHub repo of multipart and buf_redux has been archived, as far as I know there is no current plan to update them.
Following up on @Eragonfr 's comment. Is there any update on this? There is now a vulnerability in remove_dir_all
that will go away if multipart
is removed:
error[vulnerability]: Race Condition Enabling Link Following and Time-of-check Time-of-use (TOCTOU)
...
= Solution: Upgrade to >=0.8.0
= remove_dir_all v0.5.3
└── tempfile v3.3.0
└── multipart v0.18.0
└── warp v0.3.3
Hey folks, what do you think about using multer for multipart? It seems more up-to date and maintained than the other options.
I tried fixing up the conflicts, and it just got annoying. I then tried to cherry-pick manually, which made it worse. So now, I have a new PR where I just copy and pasted the changes into the files, and made it a new commit. That's in #1031.
(Using multer would be fine, I tried for a few minutes, until I realized it only has async fn
s, which means I'd need to use stream::unfold
and a BoxStream
. I'd merge a new PR doing that, if someone wanted to do it.