Incorrect parsing of nested params with closing square bracket ] in property name
Search parameters with closing square bracket in nested property names are parsed incorrectly.
Here demonstrated with /?search[withbracket[]]=foobar:
> var qs = require("qs");
undefined
> input = { search: { "withbracket[]": "foobar" }}
{ search: { 'withbracket[]': 'foobar' } }
> output = qs.stringify(input, { encode: false })
'search[withbracket[]]=foobar'
> result = qs.parse(output)
{ 'search[withbracket': [ 'foobar' ] }
> assert.deepEqual(input, result)
Uncaught AssertionError [ERR_ASSERTION]: Expected values to be loosely deep-equal:
{
search: {
'withbracket[]': 'foobar'
}
}
should loosely deep-equal
{
'search[withbracket': [
'foobar'
]
}
at REPL280:1:8
at Script.runInThisContext (node:vm:129:12)
at REPLServer.defaultEval (node:repl:572:29)
at bound (node:domain:433:15)
at REPLServer.runBound [as eval] (node:domain:444:12)
at REPLServer.onLine (node:repl:902:10)
at REPLServer.emit (node:events:525:35)
at REPLServer.emit (node:domain:489:12)
at [_onLine] [as _onLine] (node:internal/readline/interface:425:12)
at [_line] [as _line] (node:internal/readline/interface:886:18) {
generatedMessage: true,
code: 'ERR_ASSERTION',
actual: [Object],
expected: [Object],
operator: 'deepEqual'
}
For comparison the result with the native URL object:
> url = new URL("http://example.com/?search[withbracket[]]=foobar")
URL {
href: 'http://example.com/?search[withbracket[]]=foobar',
origin: 'http://example.com',
protocol: 'http:',
username: '',
password: '',
host: 'example.com',
hostname: 'example.com',
port: '',
pathname: '/',
search: '?search[withbracket[]]=foobar',
searchParams: URLSearchParams { 'search[withbracket[]]' => 'foobar' },
hash: ''
}
> [...url.searchParams.keys()]
[ 'search[withbracket[]]' ]
When parsing a query without braces, the result looks like:
> qs.parse("search[okay]=baz")
{ search: { okay: 'baz' } }
So it should be a nested object.
Current Result
{ "search[withbracket": ["foobar"] }
Expected Result
{ "search": { "withbracket[]": ["foobar"] }}
I recently came across this weird query string combination.
Since I also (attempt to) maintain a Python and a Dart port of this awesome library I attempted to fix it in Python and Dart respectively.
The Python fix https://github.com/techouse/qs_codec/pull/1 involved addressing the regular expression
brackets: re.Pattern[str] = re.compile(r"(\[[^[\]]*])")
which matches a left bracket [, followed by any number of characters that are not brackets, and then a right bracket ]. This is why it matches the innermost brackets and produces the same wrong result as stated above.
To match the outermost brackets, including nested brackets, we must use a recursive regular expression. However, Python's re module doesn't support recursive regular expressions. This is where I had to resort to the regex module instead to achieve this
from regex import regex
brackets: regex.Pattern[str] = regex.compile(r"\[(?:[^\[\]]|(?R))*\]")
This regular expression matches a left bracket [, followed by any number of characters that are not brackets or another matching pair of brackets, and then a right bracket ]. The (?R) is a recursive subpattern that matches the entire pattern.
In the Dart fix https://github.com/techouse/qs/pull/13 I used recursive_regex to achieve the same and get from
final RegExp brackets = RegExp(r'(\[[^[\]]*])');
to
import 'package:recursive_regex/recursive_regex.dart';
final RegExp brackets = RecursiveRegex(startDelimiter: '[', endDelimiter: ']');
I'm not sure how best to implement this fix here, since JavaScript does not provide the PCRE recursive parameter (?R), however, this StackOverflow answer suggests one could use XRegExp which supports recursive matching via a plugin.
Don't we all just love regular expressions 🙈
To be honest, this issue is quite fringe and a heavy dependency could be a dealbreaker for more people than are willing to accept the lack of this fix.
In my opinion, it's ultimately down to @ljharb to give the go-ahead on this or not.
Thoughts?
CC / @gronke
It should definitely be fixed, but I’m not excited about relying on regular expressions for a non-regular DSL :-)