jmeter
fix: The HTTP Header of `multipart/form-data` no longer includes `charset`.
Description
Closes #6250.
Motivation and Context
On some web server implementations, including `charset` in the `Content-Type` request header of `multipart/form-data` can result in parsing errors of the boundary, leading to a failure in sending form content.
This is inconsistent with the behavior in JMeter 5.6.2 and also does not comply with the RFC.
Perhaps fixing this in the httpclient would be a more elegant choice, but:
- The last update on httpclient 4.x was 2 years ago, so a fix there would be slow to arrive.
- This issue occurred in JMeter 5.6.3 (#5987). This PR will revert JMeter's behavior to 5.6.2, reducing user frustration.
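To illustrate the kind of boundary parsing failure described above, here is a minimal sketch using only the JDK. The class `BoundaryParsingDemo` and both parsing methods are hypothetical and are not taken from JMeter, HttpClient, or any particular server; the sketch assumes the `charset` parameter follows `boundary` in the header, and it only shows how a parser that takes everything after `boundary=` might break once `charset` is appended.

```java
import java.util.Locale;

/** Hypothetical illustration; not code from JMeter, HttpClient, or a real server. */
class BoundaryParsingDemo {

    /** Naive approach: treats everything after "boundary=" as the boundary. */
    static String naiveBoundary(String contentType) {
        String key = "boundary=";
        int idx = contentType.indexOf(key);
        return idx < 0 ? null : contentType.substring(idx + key.length());
    }

    /**
     * More careful approach: splits the header into parameters on ';' and
     * returns the value of the one named "boundary". This is still a
     * simplification of real header parsing.
     */
    static String parameterAwareBoundary(String contentType) {
        String key = "boundary=";
        for (String param : contentType.split(";")) {
            String trimmed = param.trim();
            if (trimmed.toLowerCase(Locale.ROOT).startsWith(key)) {
                String value = trimmed.substring(key.length());
                // Strip optional surrounding double quotes.
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
                return value;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String withCharset = "multipart/form-data; boundary=abc123; charset=UTF-8";
        String withoutCharset = "multipart/form-data; boundary=abc123";

        // The naive parser picks up the trailing charset parameter as part of the boundary.
        System.out.println(naiveBoundary(withCharset));
        System.out.println(naiveBoundary(withoutCharset));
        // The parameter-aware parser returns the same boundary in both cases.
        System.out.println(parameterAwareBoundary(withCharset));
        System.out.println(parameterAwareBoundary(withoutCharset));
    }
}
```

With the naive parser, the first header yields `abc123; charset=UTF-8` as the boundary, which would never match the delimiters in the body. Omitting `charset` from the header sidesteps this class of server-side bug.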
How Has This Been Tested?
- A unit test case
- runGUI, then an end-to-end test
Screenshots (if appropriate):
After: charset removed
Types of changes
- Bug fix (non-breaking change which fixes an issue)
Checklist:
- [x] My code follows the code style of this project.
- [ ] I have updated the documentation accordingly.
It seems that the Facebook homepage doesn't return 200, but redirects to the login page.
I'm not sure if we need to modify the test case `ResponseDecompression.jmx`.
Debug info:
curl -v https://www.facebook.com
* Trying 31.13.70.36:443...
* Connected to www.facebook.com (31.13.70.36) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* CAfile: /etc/ssl/certs/ca-certificates.crt
* CApath: /etc/ssl/certs
* TLSv1.0 (OUT), TLS header, Certificate Status (22):
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* TLSv1.2 (IN), TLS header, Certificate Status (22):
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS header, Finished (20):
* TLSv1.2 (IN), TLS header, Supplemental data (23):
* TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8):
* TLSv1.2 (IN), TLS header, Supplemental data (23):
* TLSv1.2 (IN), TLS header, Supplemental data (23):
* TLSv1.3 (IN), TLS handshake, Certificate (11):
* TLSv1.3 (IN), TLS handshake, CERT verify (15):
* TLSv1.3 (IN), TLS handshake, Finished (20):
* TLSv1.2 (OUT), TLS header, Finished (20):
* TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.2 (OUT), TLS header, Supplemental data (23):
* TLSv1.3 (OUT), TLS handshake, Finished (20):
* SSL connection using TLSv1.3 / TLS_CHACHA20_POLY1305_SHA256
* ALPN, server accepted to use h2
* Server certificate:
* subject: C=US; ST=California; L=Menlo Park; O=Meta Platforms, Inc.; CN=*.facebook.com
* start date: Dec 26 00:00:00 2023 GMT
* expire date: Mar 25 23:59:59 2024 GMT
* subjectAltName: host "www.facebook.com" matched cert's "*.facebook.com"
* issuer: C=US; O=DigiCert Inc; OU=www.digicert.com; CN=DigiCert SHA2 High Assurance Server CA
* SSL certificate verify ok.
* Using HTTP2, server supports multiplexing
* Connection state changed (HTTP/2 confirmed)
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
* TLSv1.2 (OUT), TLS header, Supplemental data (23):
* TLSv1.2 (OUT), TLS header, Supplemental data (23):
* TLSv1.2 (OUT), TLS header, Supplemental data (23):
* Using Stream ID: 1 (easy handle 0x563b8d507b60)
* TLSv1.2 (OUT), TLS header, Supplemental data (23):
> GET / HTTP/2
> Host: www.facebook.com
> user-agent: curl/7.81.0
> accept: */*
>
* TLSv1.2 (IN), TLS header, Supplemental data (23):
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
* TLSv1.2 (IN), TLS header, Supplemental data (23):
* TLSv1.2 (OUT), TLS header, Supplemental data (23):
* TLSv1.2 (IN), TLS header, Supplemental data (23):
* TLSv1.2 (IN), TLS header, Supplemental data (23):
* TLSv1.2 (IN), TLS header, Supplemental data (23):
< HTTP/2 302
< set-cookie: fr=0m99cpMhrD8XneYOy..Bl99ys..AAA.0.0.Bl99ys.AWVSv6D5HHg; expires=Sun, 16-Jun-2024 06:18:20 GMT; Max-Age=7776000; path=/; domain=.facebook.com; secure; httponly
< location: https://www.facebook.com/login/?next=https%3A%2F%2Fwww.facebook.com%2F
I believe the core of the issue is that Apache HttpClient adds the excessive encoding header: https://github.com/apache/httpcomponents-client/blob/54900db4653d7f207477e6ee40135b88e9bcf832/httpmime/src/main/java/org/apache/http/entity/mime/MultipartEntityBuilder.java#L215-L217
However, it does use the charset provided in `multipartEntityBuilder.setCharset(charset);` in https://github.com/apache/httpcomponents-client/blob/54900db4653d7f207477e6ee40135b88e9bcf832/httpmime/src/main/java/org/apache/http/entity/mime/MultipartEntityBuilder.java#L228 => https://github.com/apache/httpcomponents-client/blob/54900db4653d7f207477e6ee40135b88e9bcf832/httpmime/src/main/java/org/apache/http/entity/mime/HttpBrowserCompatibleMultipart.java#L69
How about fixing the issue in HTTP client instead?
As a temporary workaround, we could clone `MultipartEntityBuilder` into JMeter's codebase and remove the offending `paramsList.add(new BasicNameValuePair("charset", charsetCopy.name()));`. However, the removal of `setCharset` does not sound right to me, as it would cause using ASCII via `MIME.DEFAULT_CHARSET`, which is just wrong.
> the core of the issue is that Apache Http Client

I completely agree.

> How about fixing the issue in HTTP client instead?

Perhaps issues in HttpClient should be fixed within HttpClient itself, but the changes would be substantial.

> the removal of setCharset does not sound right to me as it would cause using ASCII via MIME.DEFAULT_CHARSET which is just wrong.

I'm not sure where the mistake lies in doing so. Could you provide an example? Thank you.
If I understand correctly, `MultipartEntityBuilder.setCharset` only affects the HTTP header and does not affect the processing of the HTTP body.
For `multipart/form-data`, the request content encoding is unrelated to the request headers and instead resides within the individual parts of the HTTP body.
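A minimal sketch of that layout, written by hand with only the JDK: the top-level `Content-Type` carries just the boundary, while the text part declares and uses UTF-8 on its own. The class `MultipartBodyDemo` and its methods are hypothetical names for illustration, and this is not how HttpClient or JMeter actually assembles the entity.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical illustration of a hand-built multipart/form-data request. */
class MultipartBodyDemo {

    /** Top-level request header value: only the boundary parameter, no charset. */
    static String contentTypeHeader(String boundary) {
        return "multipart/form-data; boundary=" + boundary;
    }

    /** One text part; the encoding is declared inside the part and applied to its bytes. */
    static byte[] textPart(String boundary, String name, String value) {
        String part = "--" + boundary + "\r\n"
                + "Content-Disposition: form-data; name=\"" + name + "\"\r\n"
                + "Content-Type: text/plain; charset=UTF-8\r\n"
                + "\r\n"
                + value + "\r\n";
        return part.getBytes(StandardCharsets.UTF_8);
    }

    /** Full body: a single text part followed by the closing delimiter. */
    static byte[] body(String boundary, String name, String value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] part = textPart(boundary, name, value);
        out.write(part, 0, part.length);
        byte[] closing = ("--" + boundary + "--\r\n").getBytes(StandardCharsets.US_ASCII);
        out.write(closing, 0, closing.length);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        String boundary = "demoBoundary123"; // assumed value, for illustration only
        String value = "\u4f60\u597d";       // two non-ASCII characters
        System.out.println("Content-Type: " + contentTypeHeader(boundary));
        System.out.println(new String(body(boundary, "greeting", value), StandardCharsets.UTF_8));
    }
}
```

A server that reads the part as UTF-8 recovers the original text even though the request header never mentions a charset.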
These UTF-8 texts work fine.
So, according to RFC 7578, perhaps HttpClient shouldn't provide the `setCharset` method? Or perhaps JMeter shouldn't invoke `setCharset`?
It turns out that, until HttpClient handles the charset issue properly, JMeter just needs to refrain from calling the `setCharset` method to avoid adding `charset=utf-8` to the request header (although HttpClient remains at the core of the issue). This doesn't affect the correctness of the HTTP body content (there might be other impacts not mentioned; feel free to point them out).
> Or perhaps JMeter shouldn't invoke setCharset?

As I said earlier, `MultipartEntityBuilder.setCharset` was needed there for `HttpBrowserCompatibleMultipart.java` to work, as there's no other option to configure charset for `HttpBrowserCompatibleMultipart` mode 🤷
Thank you, I understand your concern now.
If the `setCharset` method is not called, `HttpBrowserCompatibleMultipart` receives `charsetCopy` as `null`.
Furthermore, `HttpBrowserCompatibleMultipart.charset` == 'US-ASCII'.
Is this ?
However, from the results, it still works fine:
There is no `charset='US-ASCII'` in the HTTP header.
The content in the HTTP body is also UTF-8.
I don't know the reasons behind it, nor do I know if it's stable.
Thank you again for everything you have done!