vert.x icon indicating copy to clipboard operation
vert.x copied to clipboard

Provide an option to handle form fields similarly to file uploads

Open marekscholle opened this issue 2 years ago • 7 comments

For multipart POST requests marked as setExpectMultipart(true), the Vert.x policy is that POST form fields are provided as key-value String map when the request body is fully read, and for file uploads we register upload handlers to handle them in a streaming fashion, buffer-by-buffer.

With Vert.x <= 3.9.5, oversized attributes might cause OOM, from Vert.x 3.9.6 there is a limit on maximal form fields size; Vert.x refuses to collect large attributes as Strings in memory. See also https://github.com/eclipse-vertx/vert.x/issues/3794

For some specific technological system we deal with, it's possible for our customers to upload large files through form fields. This might be an unfortunate usage of POST form fields designed to deal with "small HTML form inputs", but there are tools that use POST form fields to send megabytes (and more) of data.

It would be nice if there was an option that would allow us to install "form field handler" to deal with large form fields. The default may still be the strategy "form fields should be short", but a simple boolean option to handle form fields as "anonymous file uploads" would really help us.

I'm no expert on Netty, but it looks there is one "if" in HttpPostMultipartRequestDecoder which decides whether to collect in memory (decodeMultipart(MultiPartStatus.FIELD)) or to handle files (decodeMultipart(MultiPartStatus.FILEUPLOAD)). We need to have control on this "if" to always take the FILEUPLOAD path. This probably requires a small change in Netty as well to provide this control.

marekscholle avatar Sep 15 '22 16:09 marekscholle

can you investigate whether this would be necessary in Netty and open an issue there if that's needed ?

vietj avatar Sep 16 '22 13:09 vietj

I think we may hack into netty HttpPostMultipartRequestDecoder which accepts a HttpDataFactory factory. This has methods Attribute createAttribute(...) and FileUpload createFileUpload(...). As the names suggest, createAttribute is a netty customization point to process form fields, while createFileUpload is a customization point to process uploads.

With setExpectMultipart(true), Vert.x creates a HttpPostMultipartRequestDecoder and passes NettyFileUploadDataFactory deriving from netty's DefaultHttpDataFactory. This NettyFileUploadDataFactory:

  • keeps the netty implementation of createAttribute which collects data to memory
  • overrides createFileUpload: each upload provided by netty is passed to installed upload handler

We should be able to override HttpDataFactory.createAttribute so that it returns an Attribute which in fact is a thin wrapper above NettyFileUpload:

class FileUploadAttribute implements Attribute {
    private final NettyFileUpload underlying;

    FileUploadAttribute(...) {
        // same as in NettyFileUploadDataFactory
        underlying = new NettyFileUpload(...);
        ...;
    }

    @Override
    public void addContent(ByteBuf buffer, boolean last) throws IOException {
        underlying.addContent(buffer, last);
    }
 
    ...
}

// 

class NettyFileUploadDataFactory extends DefaultHttpDataFactory {
    ...

    @Override
    public Attribute createAttribute(HttpRequest request, String name, long definedSize) {
        ...
        // return a wrapper above NettyFileUpload
        return new FileUploadAttribute(...);
    }
}

This way we can reuse file upload logic for processing form fields.

It is then a Vert.x API detail if users can install a special handler for form fields, or uploadHandler is notified with both file uploads and form fields (presented as "anonymous" file uploads).

I'm not sure if this is not an abuse of netty concept of "file uploads" and "attributes", but it looks to me it's a valid hooking into customization points provided by netty

marekscholle avatar Sep 17 '22 09:09 marekscholle

I think we can ask @franz1981 his opinion about such usage of Netty

vietj avatar Sep 19 '22 07:09 vietj

Let me summon @hyperxpro for this: we can turn this one in a proper Netty issue in order to help other users after being answered

franz1981 avatar Sep 19 '22 07:09 franz1981

@marekscholle Sounds good. We can have a thin layer to provide this functionality. Are you interested in doing a PR in Netty?

hyperxpro avatar Sep 19 '22 14:09 hyperxpro

@hyperxpro For me, this was the first time I looked into netty internals (while following Vert.x setExpectMultipart implementation). So if there was somebody more competent in netty and its development, it would be great.

If there is nobody who can do it, I can try. But I generally prefer to not contribute to projects which I don't know. I strongly believe that contributions should be done people who know the project (and which want to contribute to in the future, which is not case of me and netty)

@hyperxpro do we speak about PR into netty which would add FileUploadAttribute (an attribute delegating to a NettyFileUpload) to be installed by Vert.x in a HttpDataFactory.createAttribute?

marekscholle avatar Sep 19 '22 16:09 marekscholle

@marekscholle Yes, It will go into FileUploadAttribute. I will do the PR by end of this week (Have a little backlog of TODOs right now).

hyperxpro avatar Sep 21 '22 11:09 hyperxpro

@hyperxpro do you have ETA on this? I know this is a bold question... When you have something, please add me to the loop (for review)

marekscholle avatar Oct 05 '22 20:10 marekscholle

Sorry for ghosting but I'm on vacation. Probably next week, I will do PR and tag you for review. Sorry!

hyperxpro avatar Oct 06 '22 06:10 hyperxpro

@marekscholle I am having a hard time trying to reproduce this. Can you share the reproducer?

hyperxpro avatar Oct 19 '22 17:10 hyperxpro

Understood. I prepared a repo https://github.com/marekscholle/vertx-form-upload with a JUnit test demonstrating the difference between

  • handling form fields (no filename in part "header"), which Vert.x (from 3.9.6) accumulates to in-memory String. If the size of form field exceeds a limit, Vert.x 4 raises an error which we can handle by subscribing in ctx.response().exceptionHandler
  • handling file uploads for which we can register uploadHandler and handler the upload content by chunks

It is the Junit UploadTest test in vertx-4 branch. The repo is a standard Gradle repo you can clone, or you can copy https://github.com/marekscholle/vertx-form-upload/blob/vertx-4/src/test/java/com/github/marekscholle/UploadTest.java to some project of yours.

For completeness, there is also vertx-3.9.5 branch with Vert.x 3.9.5. It demonstrates that Vert.x 3.9.5 accepts also larger form fields, potentially risking OOM if what client sends is too large.

@hyperxpro please let me know if this is what you expected

marekscholle avatar Oct 21 '22 11:10 marekscholle

Let me repeat that the goal here is to provide an option to deal with large form fields (form fields without filename specified, not file uploads) somehow.

Today's Vert.x default behavior is the only one we can use, and it raises an error if an HTTP client sends large content as form field (not as file upload). The default behavior may be reasonable IMO (is what most Vert.x users want), but sending large content as form field is not illegal though – at least I'm not aware of some HTTP restriction which would define that from size, multipart fields must be file uploads.

marekscholle avatar Oct 21 '22 11:10 marekscholle

Thanks for the code and detailed explanation.

I did some research before pushing this to Netty but now it seems a little odd as this change doesn't look great when you compare it with other HTTTP implementations. Because we are trying to trick the HttpPostMultipartRequestDecoder to accept large payloads. Instead, we should extend and use a wrapper for this because Netty implementation is meant to be overridden for use case-specific scenarios.

hyperxpro avatar Oct 22 '22 17:10 hyperxpro