Bad escaping of double quote in uploaded file

Open dvarrazzo opened this issue 4 years ago • 6 comments

When uploading a file whose name contains a double quote ("), requests replaces the character with %22. This is not reversible because the % character itself is not escaped (a literal %22 cannot be told apart from an escaped quote), and it differs from what a browser does. Browsers seem to use backslash escaping instead (tested with Firefox).

Expected Result

In a shell run:

$ nc -l 0.0.0.0 8080 | grep filename

In Python

import requests
requests.post("http://localhost:8080/", files={'file': ("""foo% 'bar" \u20ac.txt""", open("README.rst"))})

Expected (what a browser sends), as seen in the nc shell:

Content-Disposition: form-data; name="file"; filename="foo% 'bar\" €.txt"

Actual Result

Content-Disposition: form-data; name="file"; filename="foo% 'bar%22 €.txt"

System Information

$ python -m requests.help
{
  "chardet": {
    "version": "3.0.4"
  },
  "cryptography": {
    "version": ""
  },
  "idna": {
    "version": "2.8"
  },
  "implementation": {
    "name": "CPython",
    "version": "3.8.5"
  },
  "platform": {
    "release": "5.4.0-51-generic",
    "system": "Linux"
  },
  "pyOpenSSL": {
    "openssl_version": "",
    "version": null
  },
  "requests": {
    "version": "2.24.0"
  },
  "system_ssl": {
    "version": "1010106f"
  },
  "urllib3": {
    "version": "1.25.8"
  },
  "using_pyopenssl": false
}

dvarrazzo avatar Oct 26 '20 12:10 dvarrazzo

This is not reversible because the % char is not escaped (it's not possible to tell apart a literal %22 from a quote)

I don't think this is categorically true, and there's information missing here. In general, the RFCs don't have specific guidance around this, so as far as I can read this behaviour doesn't violate any of them. In fact, none of them suggest how to handle a DQUOTE in this particular case. That said, unencoded DQUOTEs are very explicitly used to delimit the filename. So if something is unescaping the %22 for you, I'd argue that's because the RFC leaves the behaviour undefined and that library or tool is stepping beyond what it should. Further, filename should primarily be US-ASCII, so for latin-1 and other characters I'd expect us to try to use filename*=, which explicitly calls for URL-quoting characters.
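For reference, a minimal sketch of what a URL-quoted filename*= value could look like for the file name from the report. The helper name rfc5987_filename_param is hypothetical and is not part of requests or urllib3. Whether a given server parses filename*= in a multipart/form-data body is an assumption that would need checking.

```python
from urllib.parse import quote


def rfc5987_filename_param(name: str) -> str:
    """Build a filename*= parameter with a percent-encoded UTF-8 value.

    Hypothetical helper for illustration only. Everything except
    letters, digits and "_.-~" is percent-encoded, including "%"
    itself, so the value can be decoded unambiguously.
    """
    return "filename*=UTF-8''" + quote(name, safe="", encoding="utf-8")


# The file name used in the original report.
print(rfc5987_filename_param("foo% 'bar\" \u20ac.txt"))
# prints: filename*=UTF-8''foo%25%20%27bar%22%20%E2%82%AC.txt
```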

sigmavirus24 avatar Oct 26 '20 14:10 sigmavirus24

I don't think this is categorically true

Maybe it is not clear, but a file called "%22 is passed by requests as %22%22, which is categorically not reversible. Whether the standard is explicit or not about what to do is probably open to interpretation, and I'm sure you know more about it than I do :)

>>> requests.post("http://localhost:8080/", files={'file': ('"%22', open("README.rst"))})
$ nc -l 0.0.0.0 8080 | grep filename
Content-Disposition: form-data; name="file"; filename="%22%22"

If anything, you should percent-escape the percent sign too, to make it reversible (yielding %22%2522). However, I can tell you that in the test suite I'm working on, in a FastAPI project, the file name received by the client would not be unescaped and would be seen as %22%2522, whereas such a file name is understood correctly with backslash escaping. Maybe there is a bug on their receiving side too, but they grok an upload from Firefox with no problem.
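The reversibility point can be sketched in a few lines. Both helpers below are hypothetical and only mimic the two escaping strategies under discussion; they are not the code requests or urllib3 actually runs.

```python
from urllib.parse import unquote


def escape_quote_only(name: str) -> str:
    # Hypothetical: mimics the reported behaviour, where only '"' is replaced.
    return name.replace('"', "%22")


def escape_reversible(name: str) -> str:
    # Hypothetical: escape '%' first, then '"', so decoding is unambiguous.
    return name.replace("%", "%25").replace('"', "%22")


# Two different file names collide when only the quote is escaped.
assert escape_quote_only('"%22') == "%22%22"
assert escape_quote_only('%22"') == "%22%22"

# Escaping '%' as well keeps the names distinct and round-trips cleanly.
assert escape_reversible('"%22') == "%22%2522"
assert unquote(escape_reversible('"%22')) == '"%22'
```

This only shows that the second scheme can be decoded without ambiguity. It says nothing about whether a receiving server will actually decode it.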

dvarrazzo avatar Oct 26 '20 14:10 dvarrazzo

Yeah, it's regrettably unsurprising for web servers in Python not to code to a standard. Many don't support the standard for filename*= either. It's also worth pointing out that Requests delegates this to urllib3, which I think has been updated to the latest and "greatest" HTML5 standard around multipart/form-data, so beyond the decades-old standards that aren't being observed, there are newer ones that server implementers also seem to be ignoring. (sarcastic-yay) Regardless, you're correct. The url-encoding here isn't correct.

sigmavirus24 avatar Oct 27 '20 14:10 sigmavirus24

Try this:

"maingame": { "day1": { "text1": "Tag 1", "text2": "Heute startet unsere Rundreise " Example text". Jeden Tag wird ein neues Reiseziel angesteuert bis wir. " } }

When rendered in HTML it shows as "Example text". What is the correct way? Maybe the link below will help. I found it here

YashVadhadiya avatar Nov 13 '20 16:11 YashVadhadiya

Browsers seem to use backslash escape instead (tested with Firefox).

Firefox (now) also replaces " with %22, so it does not backslash escape (anymore, maybe it did 2 years ago when this issue was opened). Chrome and curl do exactly the same. So I wouldn't say requests is broken, it just works like all other clients/browsers out there. (I am not saying this is good, it's just how it is).

mkurz avatar Nov 17 '22 23:11 mkurz

I am seeing this issue as well. If I try to upload a file named "file\r\n%64.txt", it gets encoded to "file%0D%0A%64.txt", which decodes back to "file\r\nd.txt". The literal percent sign in the filename should be getting percent-encoded along with the newline and carriage return.
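A small sketch reproducing this case with the standard library. The escaping function is a hypothetical stand-in that assumes only CR, LF and the double quote are replaced, as the observed output suggests; it is not taken from requests or urllib3.

```python
from urllib.parse import unquote


def escape_without_percent(name: str) -> str:
    # Hypothetical stand-in: assumes only CR, LF and '"' are replaced
    # and a literal '%' is passed through untouched.
    return name.replace("\r", "%0D").replace("\n", "%0A").replace('"', "%22")


name = "file\r\n%64.txt"
encoded = escape_without_percent(name)

assert encoded == "file%0D%0A%64.txt"
# A receiver that percent-decodes also turns the literal "%64" into "d".
assert unquote(encoded) == "file\r\nd.txt"

# Escaping '%' first would preserve the original name on decode.
safe = escape_without_percent(name.replace("%", "%25"))
assert safe == "file%0D%0A%2564.txt"
assert unquote(safe) == name
```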

dpitch40 avatar Oct 10 '23 17:10 dpitch40