requests
Bad escaping of double quote in uploaded file
When uploading a file whose name contains a double quote ("), requests replaces the character with %22. This is not reversible because the % char is not escaped (it's not possible to tell apart a literal %22 from a quote), and it's different from what a browser does. Browsers seem to use backslash escape instead (tested with Firefox).
Expected Result
In a shell run:
$ nc -l 0.0.0.0 8080 | grep filename
In Python
requests.post("http://localhost:8080/", files={'file': ("""foo% 'bar" \u20ac.txt""", open("README.rst"))})
Expected, what a browser returns, in the nc shell:
Content-Disposition: form-data; name="file"; filename="foo% 'bar\" €.txt"
Actual Result
Content-Disposition: form-data; name="file"; filename="foo% 'bar%22 €.txt"
System Information
$ python -m requests.help
{
"chardet": {
"version": "3.0.4"
},
"cryptography": {
"version": ""
},
"idna": {
"version": "2.8"
},
"implementation": {
"name": "CPython",
"version": "3.8.5"
},
"platform": {
"release": "5.4.0-51-generic",
"system": "Linux"
},
"pyOpenSSL": {
"openssl_version": "",
"version": null
},
"requests": {
"version": "2.24.0"
},
"system_ssl": {
"version": "1010106f"
},
"urllib3": {
"version": "1.25.8"
},
"using_pyopenssl": false
}
This is not reversible because the % char is not escaped (it's not possible to tell apart a literal %22 from a quote)
I don't think this is categorically true, and there's information missing here. In general, the RFCs don't have specific guidance around this, so this behaviour doesn't violate any of them from my reading. In fact, none of them suggest how to handle a DQUOTE in this particular case. That said, unencoded DQUOTEs are very explicitly used to delimit the filename. So if something is unescaping the %22 for you, I'd argue that's because the RFC has undefined behaviour and that library/tool/etc. is stepping beyond what it should. Further, filename should primarily be US-ASCII, so for latin-1 characters and others, I'd expect us to try to use filename*=, which explicitly calls for URL quoting characters.
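For readers unfamiliar with the filename*= form mentioned above, here is a minimal sketch of what such a parameter value could look like for the file name from the original report. The helper names ext_value and parse_ext_value are made up for illustration, and this is not what requests or urllib3 actually emit; it only shows the percent-encoded UTF-8 form that the comment refers to.

```python
from urllib.parse import quote, unquote


def ext_value(name: str) -> str:
    """Build a UTF-8 extended value of the kind used by filename*=.

    quote(..., safe="") percent-encodes every character except letters,
    digits and "_.-~", so '%' and '"' are always encoded.
    """
    return "UTF-8''" + quote(name, safe="", encoding="utf-8")


def parse_ext_value(value: str) -> str:
    """Inverse of ext_value: split off charset and language, then decode."""
    charset, _language, encoded = value.split("'", 2)
    return unquote(encoded, encoding=charset)


name = "foo% 'bar\" \u20ac.txt"
param = ext_value(name)
print("filename*=" + param)
print(parse_ext_value(param) == name)
```

Because the percent sign itself is encoded as %25, the value decodes back to the original name unambiguously. Whether a given server understands filename*= is a separate question.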
I don't think this is categorically true
Maybe it is not clear, but a file called "%22 is passed by requests as %22%22, which is categorically not reversible. Whether the standard is explicit or not about what to do is probably open to interpretation, and I'm sure you know more than I do about it :)
>>> requests.post("http://localhost:8080/", files={'file': ('"%22', open("README.rst"))})
$ nc -l 0.0.0.0 8080 | grep filename
Content-Disposition: form-data; name="file"; filename="%22%22"
If anything, you should percent-escape the percent too to make it reversible (yielding %22%2522); however, I can tell you that in the test suite I'm working on, in a FastAPI project, the file name received by the client will not be unescaped and will be seen as %22%2522, whereas such a file name is understood correctly using backslash escaping. Maybe there is a bug for them too, on the receiving side; however, they grok an upload from Firefox no problem.
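To make the reversibility argument concrete, here is a small sketch. Both helpers are hypothetical models written for this illustration (they are not the requests or urllib3 code): one replaces only the double quote, as reported in this issue, and the other escapes the percent sign first.

```python
from urllib.parse import unquote


def escape_quote_only(name: str) -> str:
    """Hypothetical model of the reported behaviour: only '"' is replaced."""
    return name.replace('"', "%22")


def escape_percent_then_quote(name: str) -> str:
    """Hypothetical reversible variant: escape '%' first, then '"'."""
    return name.replace("%", "%25").replace('"', "%22")


names = ['"%22', '%22"', '""', "%22%22"]

# Four distinct file names collapse to one value on the wire.
lossy = {escape_quote_only(n) for n in names}

# Escaping '%' first keeps them distinct, and unquote() recovers each name.
reversible = [escape_percent_then_quote(n) for n in names]
round_trip = [unquote(v) for v in reversible]

print(lossy)
print(reversible)
print(round_trip == names)
```

With the quote-only scheme, a receiver that sees %22%22 cannot know which of the four names was meant. With the percent sign escaped as well, the name "%22 becomes %22%2522, matching the value mentioned above.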
Yeah, it's regrettably unsurprising for web servers in Python not to code to a standard. Many don't support the standard for using filename*= either. Also worth pointing out that Requests delegates this to urllib3, which I think has been updated to the latest and "greatest" HTML5 standard around multipart/form-data, so beyond the decades-old standards that aren't being observed, there are newer ones that it seems server implementers are also ignoring. (sarcastic-yay) Regardless, you're correct. The url-encoding here isn't correct.
Try this:
"maingame": { "day1": { "text1": "Tag 1", "text2": "Heute startet unsere Rundreise " Example text". Jeden Tag wird ein neues Reiseziel angesteuert bis wir. " } }
When rendered in the HTML it shows as "Example text". What is the correct way? Maybe the link below will help. I found it here
Browsers seem to use backslash escape instead (tested with Firefox).
Firefox (now) also replaces " with %22, so it does not backslash escape (anymore; maybe it did 2 years ago when this issue was opened). Chrome and curl do exactly the same.
So I wouldn't say requests is broken, it just works like all other clients/browsers out there. (I am not saying this is good, it's just how it is).
I am seeing this issue as well. If I try to upload a file named "file\r\n%64.txt", it gets encoded to "file%0D%0A%64.txt", which decodes back to "file\r\nd.txt". The literal percent sign in the filename should be getting percent-encoded along with the newline and carriage return.
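The round trip described in this last comment can be reproduced with the standard library alone. The two escape tables below are assumptions made for illustration (they model the behaviour reported here and a variant that also escapes the percent sign); they are not copied from urllib3.

```python
from urllib.parse import unquote

# Hypothetical escape tables, not the actual urllib3 implementation.
REPORTED = {'"': "%22", "\r": "%0D", "\n": "%0A"}
WITH_PERCENT = {"%": "%25", **REPORTED}


def escape(name: str, table: dict) -> str:
    """Escape each character once, so an inserted '%' is never re-escaped."""
    return "".join(table.get(ch, ch) for ch in name)


name = "file\r\n%64.txt"

sent = escape(name, REPORTED)
fixed = escape(name, WITH_PERCENT)

print(repr(sent), "->", repr(unquote(sent)))
print(repr(fixed), "->", repr(unquote(fixed)))
```

A receiver that percent-decodes the first value ends up with "file\r\nd.txt", because the literal %64 is read as an encoded "d". The second value decodes back to the original name.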