plus sign (+) in query string incorrectly escaped to %2B

Open superhawk610 opened this issue 1 year ago • 1 comments

Expected Behavior

Sending a GET request to http://localhost:3000/some/path?foo=bar+baz should send the following HTTP request to the server running at port 3000:

GET /some/path?foo=bar+baz HTTP/1.1

Specifically, the query string ?foo=bar+baz should be passed through unchanged, as + belongs to the set of URI reserved characters as defined in RFC 3986, Section 2.2 (emphasis mine):

2.2. Reserved Characters

URIs include components and subcomponents that are delimited by characters in the "reserved" set. These characters are called "reserved" because they may (or may not) be defined as delimiters by the generic syntax, by each scheme-specific syntax, or by the implementation-specific syntax of a URI's dereferencing algorithm. If data for a URI component would conflict with a reserved character's purpose as a delimiter, then the conflicting data must be percent-encoded before the URI is formed.

 reserved    = gen-delims / sub-delims

 gen-delims  = ":" / "/" / "?" / "#" / "[" / "]" / "@"

 sub-delims  = "!" / "$" / "&" / "'" / "(" / ")"
             / "*" / "+" / "," / ";" / "="

The purpose of reserved characters is to provide a set of delimiting characters that are distinguishable from other data within a URI. URIs that differ in the replacement of a reserved character with its corresponding percent-encoded octet are not equivalent. Percent-encoding a reserved character, or decoding a percent-encoded octet that corresponds to a reserved character, will change how the URI is interpreted by most applications. Thus, characters in the reserved set are protected from normalization and are therefore safe to be used by scheme-specific and producer-specific algorithms for delimiting data subcomponents within a URI.

Here's the relevant portion:

Percent-encoding a reserved character, or decoding a percent-encoded octet that corresponds to a reserved character, will change how the URI is interpreted by most applications.
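To make the expected behavior concrete, here is a minimal Python sketch of an escaper that leaves reserved characters alone. The name `escape_query` and the `QUERY_SAFE` constant are hypothetical and only for illustration; this is not Insomnia's implementation, just one way to express "percent-encode only what the RFC 3986 query grammar does not allow" using the standard library's `urllib.parse.quote`.

```python
from urllib.parse import quote

# sub-delims from RFC 3986, Section 2.2.
SUB_DELIMS = "!$&'()*+,;="

# RFC 3986, Section 3.4: query = *( pchar / "/" / "?" ), and pchar also
# allows ":" and "@". "%" is kept so that octets which are already
# percent-encoded are not encoded a second time (assumption: the input
# contains no stray "%" characters).
QUERY_SAFE = SUB_DELIMS + ":@/?" + "%"


def escape_query(raw_query: str) -> str:
    """Hypothetical helper: escape only characters not allowed in a query."""
    return quote(raw_query, safe=QUERY_SAFE)


# The reserved "+" passes through unchanged, as this issue expects.
print(escape_query("foo=bar+baz"))
# A literal space is not allowed in a query, so it is percent-encoded.
print(escape_query("foo=bar baz"))
# An escaper that does not treat "+" as safe produces the reported behavior.
print(quote("foo=bar+baz", safe="=&"))
```

The key design choice in this sketch is the `safe` argument: anything listed there is passed through untouched, so whether `+` survives depends entirely on whether the client considers it safe.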

Actual Behavior

This request is sent to the server running on port 3000:

GET /some/path?foo=bar%2Bbaz HTTP/1.1

%2B is the correct percent-encoded form of the literal plus sign +, but in a query string the plus sign is a reserved character conventionally used to represent a space (the application/x-www-form-urlencoded convention), so it should not be percent-encoded. Here are examples from the URI parsing libraries of a few popular languages (Elixir, Node, Python) to illustrate why this is problematic:

Erlang/OTP 25 [erts-13.0.3] [source] [64-bit] [smp:16:16] [ds:16:16:10] [async-threads:1] [jit:ns]

Interactive Elixir (1.13.4) - press Ctrl+C to exit (type h() ENTER for help)
iex(1)> URI.decode_query("foo=bar+baz")
%{"foo" => "bar baz"}
iex(2)> URI.decode_query("foo=bar%2Bbaz")
%{"foo" => "bar+baz"}

Welcome to Node.js v18.8.0.
Type ".help" for more information.
> const qs = require('node:querystring')
> qs.decode('foo=bar+baz')
{ foo: 'bar baz' }
> qs.decode('foo=bar%2Bbaz')
{ foo: 'bar+baz' }

Python 3.9.12 (main, Mar 26 2022, 15:52:10)
[Clang 13.0.0 (clang-1300.0.29.30)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from urllib.parse import parse_qs
>>> parse_qs('foo=bar+baz')
{'foo': ['bar baz']}
>>> parse_qs('foo=bar%2Bbaz')
{'foo': ['bar+baz']}

Reproduction Steps

No response
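In the absence of formal steps, one possible way to observe the behavior is to run a small local server that echoes the raw request target, then send the GET request from Insomnia. The sketch below uses only Python's standard library; `EchoTargetHandler`, `make_server`, and `seen_targets` are hypothetical names chosen for illustration, and port 3000 is taken from the URL above.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Raw request targets received so far, in arrival order.
seen_targets = []


class EchoTargetHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # self.path holds the request target as it appeared on the request
        # line, so "+" and "%2B" can be told apart here.
        seen_targets.append(self.path)
        body = self.path.encode("ascii", "replace")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        # Silence the default per-request logging.
        pass


def make_server(port: int = 3000) -> HTTPServer:
    """Create (but do not start) a server bound to localhost."""
    return HTTPServer(("127.0.0.1", port), EchoTargetHandler)
```

Calling `make_server().serve_forever()` and then requesting http://localhost:3000/some/path?foo=bar+baz from the client under test should return the request target the server actually received as the response body.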

Is there an existing issue for this?

Additional Information

I'm opening a new issue to follow up on #1712. That issue referenced a specific URL that showed why this percent-encoding is problematic: the user expected one or more search results but got none because whitespace in the query was replaced with +.

Insomnia Version

2022.5.1

What operating system are you using?

macOS

Operating System Version

macOS Monterey 12.3.1

Installation method

download from insomnia.rest

Last Known Working Insomnia version

n/a

superhawk610, Aug 25 '22 19:08

cc @wongstein

superhawk610, Aug 25 '22 19:08