cpp-httplib
cpp-httplib copied to clipboard
URL query string encode not percent-encoding some characters
According to rfc3986, only unreserved characters (ALPHA / DIGIT / "-" / "." / "_" / "~" ) can remain unencoded. However, the function that encodes a query string encode_query_param() is failing to encode some characters, specifically, the sub-delims group ("!" / "$" / "&" / "'" / "(" / ")" / "*" / "+" / "," / ";" / "=").
URI producing applications should percent-encode data octets that correspond to characters in the reserved set unless these characters are specifically allowed by the URI scheme to represent data in that component.
To comply with the standard, this function should encode those characters as well, so I propose the following implementation.
inline std::string encode_query_param(const std::string &value) {
std::ostringstream escaped;
escaped.fill('0');
escaped << std::hex;
for (auto c : value) {
if (std::isalnum(static_cast<uint8_t>(c)) || c == '-' || c == '.' ||
c == '_' || c == '~') {
escaped << c;
} else {
escaped << std::uppercase;
escaped << '%' << std::setw(2)
<< static_cast<int>(static_cast<unsigned char>(c));
escaped << std::nouppercase;
}
}
return escaped.str();
}
This probably conflicts with the interests of @Yrds in #788, and the specifications allows the use of un-encoded reserved characters if the specific implementation allows them, so maybe the best way to fix it would be to add an extra parameter to the function with the white-listed characters.
@PabloMK7, sorry for the late reply. Could you make a pull request of the above code with at least one unit test in test/test.cc? Thanks!
@PabloMK7, the current implementation of encode_query_param behaves exactly same as JavaScript encodeURIComponent() function. I would like to stay with it unless it really creates problems.