quarkus icon indicating copy to clipboard operation
quarkus copied to clipboard

RESTEasy Reactive - Server side multipart implementation is not being respected

Open technical-debt-collector opened this issue 2 years ago • 7 comments

Describe the bug

We have a multipart endpoint which accepts two parts handled as Strings such as this:

public class MultipartUpload {

    @RestForm
    @PartType(MediaType.TEXT_PLAIN)
    public String textPart;

    @RestForm
    @PartType(MediaType.APPLICATION_OCTET_STREAM)
    public String anotherTextPart;

}

This is after we migrated to RESTEasy Reactive. With the exception of anotherTextPart being of type InputStream instead nothing has really changed since then, so essentially a multipart where we're not interested in working with the parts as files.

What we weren't aware of was that some of our users are sending us parts with a filename provided, and whenever they do these two parts end up as null. I've tried looking into the implementation and it kind of looks like the MultipartParser will store everything into a file whenever a filename is provided. Haven't really understood the code completely but my guess is that the implementation will then fail to retrieve the data that was read unless the multipart is defined with File types. But I might be wrong here though.

Looking at the RFC-2183 for the Content-Disposition header, this behavior does not seem in line with the specification:

The presence of the filename parameter does not force an implementation to write the entity to a separate file. It is perfectly acceptable for implementations to leave the entity as part of the normal mail stream unless the user requests otherwise. As a consequence, the parameter may be used on any MIME entity, even `inline' ones. These will not normally be written to files, but the parameter could be used to provide a filename if the receiving user should choose to write the part to a file.

The users haven't changed anything in their implementation so it seems pretty evident that this has changed between Classic and Reactive.

Expected behavior

The server side implementation should be respected regardless of what the client sends. Unless a RESTEasy server multipart has implemented it, don't write parts of a multipart request to file, even if the Content-Disposition includes a filename. The request should still work as if the client had not provided a file name for the part in question, and such is the case in Classic.

Actual behavior

If a client sends a part with filename in the Content-Disposition header, the receiving end implementation writes the data to a file regardless of what the server side multipart implementation looks like. However, if the server side does not implement the part as a file it will not be able to retrieve the data for further use, i.e the part becomes null.

How to Reproduce?

No response

Output of uname -a or ver

No response

Output of java -version

No response

GraalVM version (if different from Java)

No response

Quarkus version or git rev

No response

Build tool (ie. output of mvnw --version or gradlew --version)

No response

Additional information

No response

/cc @FroMage, @geoand, @stuartwdouglas

quarkus-bot[bot] avatar Aug 02 '22 10:08 quarkus-bot[bot]

The users haven't changed anything in their implementation so it seems pretty evident that this has changed between Classic and Reactive.

This is correct - we explicitly mention that the multipart support has been designed from scratch.

geoand avatar Aug 02 '22 10:08 geoand

But is this something you'd consider fixing? I think it's really unfortunate that an optional header value can ruin a completely valid request. For the time being we've had to revert back to RE classic.

Your issue is that you would like things that have a filename to not be considered a file?

If so, I'm pretty sure we've discussed it before (I need to find the issue) and rejected it, but I'd be open to hearing why it's so important

geoand avatar Aug 04 '22 13:08 geoand

My issue is that I'm creating a resource method in which I don't want to handle the data as files. If I'm not mistaken, this is exactly what happens now, and it's all up to the client to decide.

The server shouldn't have to store it as a file unless the request method is implemented to handle them as such. Shouldn't the users developing the actual endpoint decide what happens with the data?

  • I guess this one isn't so bad seeing as the Quarkus default is to remove file uploads immediately after the response is returned, as long as developers are duly warned about request multipart data being written to files on their server whether they've specified it or not.
  • Generally, if decisions are made to contradict specifications then I think it should be made clear somewhere so that developers don't risk spending time implementing a library which doesn't work the way they expect them to. This might be the case though, I'm not going to claim that I've gone through all the documentation

But even if the above behavior is wanted, the resource method seemingly isn't able to use the data for some reason. As mentioned the parts are null when accessed. Why don't you think this is worth fixing?

By the way, are you thinking about this one? https://github.com/quarkusio/quarkus/issues/20938 They're pretty closely related although they are kind of opposites in this case if you're asking me.

If I'm not mistaken, this is exactly what happens now, and it's all up to the client to decide

The reason it's done the this way is to avoid having large payloads in memory.

But even if the above behavior is wanted, the resource method seemingly isn't able to use the data for some reason. As mentioned the parts are null when accessed. Why don't you think this is worth fixing

Not sure exactly what you are referring to here.

geoand avatar Aug 09 '22 06:08 geoand

While creating a reproducer I discovered this log message which clearly explains the situation: 2022-08-09 10:55:47,165 WARN [org.jbo.res.rea.ser.han.RequestDeserializeHandler] (executor-thread-0) Attribute 'anotherTextPart' of the multipart request is a file and therefore its value is not set. To obtain the contents of the file, use type 'interface org.jboss.resteasy.reactive.multipart.FileUpload' as the field type. This was of course hidden in our application due to the log configuration we're running with :)

Anyhow, the reproducer can be found here: https://github.com/technical-debt-collector/resteasy-multipart-reproducer There's a test class there that verifies the behaviour (resource methods are annotated as @Blocking so that we may intercept and print out the content headers during the requests).

After having looked at this again it seems like certain content types must be in place too (along with a filename) in order for RE Reactive to consider the part a file. If the test provides the values as text/plain it works, but if it provides the data as a byte[], then we get the log message and are left with an inaccessible part in the resource method.

In either case I still think this is unfortunate as we can't make the same multipart resource method handle both FileUploads and Strings as RE Classic would. Changing the multipart object to FileUpload as suggested by the log statement would fix the problem we're facing but would just introduce another in the form of https://github.com/quarkusio/quarkus/issues/20938.

Maybe @Sgitario can take a look at this when he has time with a fresh pair of eyes 😉

geoand avatar Aug 23 '22 09:08 geoand

To sum up, what you're requesting (and what spec conforms) is that when getting the following request:

Content-Disposition: form-data; name="textPart"; filename="my_file.txt"
Content-Type: application/octet-stream
Content-Transfer-Encoding: binary

Then, if we implement our resource as:

@Path("/example")
public class ExampleResource {

    @POST
    @Consumes(MediaType.MULTIPART_FORM_DATA)
    @Produces(MediaType.TEXT_PLAIN)
    public Uni<Response> post(@MultipartForm MultipartUpload request) {
    }
}

public class MultipartUpload {
    @RestForm
    @PartType(MediaType.TEXT_PLAIN)
    public String textPart;
}

Then, regardless of the value of the Content-Disposition and filename, the server should copy the content of the file my_file.txt into the textPart field.

Did I understand well the issue? If so, and taking into account what the spec says, I think we should indeed fix this issue.

Sgitario avatar Aug 24 '22 10:08 Sgitario

Yes, exactly.

And likewise for binary content (which works great with regular RESTEasy), something like this :

@PartType(MediaType.APPLICATION_OCTET_STREAM)
public byte[] binaryPart;

@PartType(MediaType.APPLICATION_OCTET_STREAM)
public InputStream streamPart;

teedjay avatar Aug 24 '22 13:08 teedjay

fyi https://github.com/quarkusio/quarkus/pull/27498

Sgitario avatar Aug 25 '22 11:08 Sgitario

In #27526 I changed this a bit.

The reasons are about UX:

  • From the user's PoV, whether the form is urlencoded or multipart should not matter in most cases. It's an encapsulation alternative, related to the limitations of urlencoded, but not really about semantics.
  • A String is a String. Whether it's sent urlencoded, or as a file item in plain text or in binary. If you want a (decoded) String you should be able to obtain it. Its @PartType should make no difference. Neither should it make any difference if it's a file item or a form item, urlencoded or multipart.
  • The same holds true of binary types such as byte[] or InputStream.
  • These types are special, like FileUpload, File and Path, and don't go through ParamConverter or MessageBodyReader because it wouldn't make any sense.

So, I've kept the new features of this PR, but extended it to make it more general.

FroMage avatar Aug 30 '22 08:08 FroMage