undici
undici copied to clipboard
bug: header value checks are too strict in fetch
Bug Description
https://github.com/nodejs/undici/blob/0964a83710467b994cc108d096e97f69dd54ac90/lib/core/request.js#L26
The regex here is too strict for fetch
which causes a bunch of tests to fail:
{ fail: '\x01' }
{ fail: '\x02' }
{ fail: '\x03' }
{ fail: '\x04' }
{ fail: '\x05' }
{ fail: '\x06' }
{ fail: '\x07' }
{ fail: '\b' }
{ fail: '\x0E' }
{ fail: '\x0F' }
{ fail: '\x10' }
{ fail: '\x11' }
{ fail: '\x12' }
{ fail: '\x13' }
{ fail: '\x14' }
{ fail: '\x15' }
{ fail: '\x16' }
{ fail: '\x17' }
{ fail: '\x18' }
{ fail: '\x19' }
{ fail: '\x1A' }
{ fail: '\x1B' }
{ fail: '\x1C' }
{ fail: '\x1D' }
{ fail: '\x1E' }
{ fail: '\x1F' }
Reproducible By
https://gist.github.com/KhafraDev/cc178abba0a89580008f44c7ebee76a3
Sorry for the messy code it was taken almost directly from wpt (https://github.com/web-platform-tests/wpt/blob/master/fetch/api/headers/header-values-normalize.any.js)
Expected Behavior
The tests should pass.
Logs & Screenshots
Environment
Additional context
These checks were added in a29a151d to fix an issue where headers weren't being sanitized.
cc @mcollina
According to the spec, a header value is a string (or Uint8Array, etc) that has no leading or trailing whitespace and doesn't contain \0 or LF or CR.
Also see https://github.com/httpwg/http-core/issues/19#issuecomment-271903665
and the spec
The definition of header value is not defined in terms of an HTTP token production as it is broken
My understanding is that the use case of fetch in the browser vs in the server is different and it might be that those chars are safe to render on the wire from browsers but not from servers. However, that regexp might be too strict.
If you'd like to loose it a bit, it's fine as long as the tests pass.
The server is responding with the headers sent from fetch so there is an expectation that both the server and fetch can receive these headers. I will get around to fixing this, but wasn't entirely sure. 👍
I looked into it and there seems to be something preventing the request from being sent; I confirmed that the headers are being written to the socket, but can't figure out much more than that.
Somewhere along the lines 'HTTP/1.1 400 Bad Request\r\nConnection: close\r\n\r\n'
is being sent (maybe not to the socket?) and we end up with only a connection: close
header. No request is sent to the server.
Headers written to socket:
{
header: 'GET /?headers=val1|val2|val3 HTTP/1.1\r\n' +
'host: localhost:64375\r\n' +
'connection: keep-alive\r\n' +
'val1: \x01\r\n' +
'val2: x\x01\r\n' +
'val3: \x01x\r\n' +
'accept: */*\r\n' +
'accept-language: *\r\n' +
'sec-fetch-mode: cors\r\n' +
'user-agent: undici\r\n' +
'accept-encoding: gzip, deflate\r\n'
}
here: https://github.com/nodejs/undici/blob/733b3a0f0be9252067c94a4f9f522363405013e7/lib/client.js#L1401
diff
diff --git a/lib/core/request.js b/lib/core/request.js
index b05a16d..959b942 100644
--- a/lib/core/request.js
+++ b/lib/core/request.js
@@ -18,12 +18,11 @@ const util = require('./util')
const tokenRegExp = /^[\^_`a-zA-Z\-0-9!#$%&'*+.|~]+$/
/**
- * Matches if val contains an invalid field-vchar
- * field-value = *( field-content / obs-fold )
- * field-content = field-vchar [ 1*( SP / HTAB ) field-vchar ]
- * field-vchar = VCHAR / obs-text
+ * A header value is a byte sequence that matches the following conditions:
+ * * Has no leading or trailing HTTP tab or space bytes.
+ * * Contains no 0x00 (NUL) or HTTP newline bytes.
*/
-const headerCharRegex = /[^\t\x20-\x7e\x80-\xff]/
+const headerCharRegex = /^(\x09|\x20)|(\0|\r|\n)|(\x09|\x20)$/
// Verifies that a given path is valid does not contain control chars \x00 to \x20
const invalidPathRegex = /[^\u0021-\u00ff]/