undici
perf(fetch): avoid redundant conversion
I'm not done yet.
We understand that some behaviors deviate from the specification and increase maintenance costs, but these APIs should still be improved.
I'm not maintaining the web stuff anymore but if I did, I would be against anything that deviates from spec due to maintenance overhead. @KhafraDev
@KhafraDev I kept the changes to a minimum. Is this okay?
Codecov Report
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 93.65%. Comparing base (3d9fcf5) to head (798d62e).
Additional details and impacted files
@@           Coverage Diff           @@
##             main    #2894   +/-   ##
=======================================
  Coverage   93.65%   93.65%
=======================================
  Files          87       87
  Lines       23899    23908    +9
=======================================
+ Hits        22383    22392    +9
  Misses       1516     1516
:umbrella: View full report in Codecov by Sentry.
@KhafraDev https://github.com/tsctx/undici/commit/06e14d93e8e20a5c09f3e5cf6f44df9c748fe133 (This change is not reflected in the PR; see the commit linked here.)
The objective of this PR is to avoid the redundant String.prototype.toWellFormed operation.
When the string is passed to the TextEncoder API, ill-formed characters (lone surrogates) are replaced during encoding, so the explicit conversion does not need to be performed.
Removing these operations improves performance by roughly 1.5x.
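As a minimal sketch of the idea (not code from the PR): a TextEncoder already replaces a lone surrogate with U+FFFD while encoding, so calling String.prototype.toWellFormed first produces exactly the same bytes.

const encoder = new TextEncoder()

// explicit conversion first, then encode
const explicit = encoder.encode('\ud801'.toWellFormed())
// encode directly; the lone surrogate is replaced during encoding
const implicit = encoder.encode('\ud801')

console.log(explicit) // Uint8Array(3) [ 239, 191, 189 ] -> UTF-8 bytes of U+FFFD
console.log(implicit) // Uint8Array(3) [ 239, 191, 189 ] -> identical output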
wow! This seems meaningful.
I would call it misleading. What is being improved by 1.5x? If it's instantiating Responses, the change is wrong (we have to convert the body to a USVString), and other changes are cold paths/indentation.
@KhafraDev This benchmark. https://github.com/oven-sh/bun/blob/main/bench/snippets/response-arrayBuffer.mjs
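For context, the hot loop of that benchmark is roughly of this shape (a hedged sketch, not a copy of the linked file; the body size and iteration count are made up): a Response is constructed from a string body and read back, which exercises exactly the string-to-bytes conversion this PR touches.

const body = 'Hello World!'.repeat(1024)

for (let i = 0; i < 10_000; i++) {
  await new Response(body).arrayBuffer()
}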
Ignore my previous comment; this will work, but in a not-so-obvious way. I'm not sure I like trading the explicit conversion to a USVString for the implicit conversion done in TextEncoder (and Buffer, as a side note), which took a little while to figure out. I would consider it working as a side effect of how the body is extracted, and something that could easily break.
How the PR works now:
- A response is created with a string body such as '\ud801'
- The body is extracted in extractBody, which returns a stream that, when pulled from, encodes the body with a TextEncoder (sketched below)
- The TextEncoder replaces lone surrogates with U+FFFD
How it worked before:
- A response is created with a string body such as '\ud801'
- The webidl converter in the Response constructor replaced lone surrogate characters with U+FFFD
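To make the new path concrete, here is a hypothetical simplification (extractBody is undici's real helper, but the body below only mirrors the relevant behaviour): the string body is encoded lazily when the stream is pulled, and the TextEncoder performs the lone-surrogate replacement at that point.

// hypothetical, simplified version of the relevant part of extractBody
function extractStringBody (body) {
  return new ReadableStream({
    pull (controller) {
      // '\ud801' is encoded here; the TextEncoder emits the bytes of U+FFFD
      controller.enqueue(new TextEncoder().encode(body))
      controller.close()
    }
  })
}

Reading that stream for a body of '\ud801' yields the bytes 0xEF 0xBF 0xBD, the UTF-8 encoding of U+FFFD, which is why the explicit conversion in the constructor became redundant.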
It's a noticeable difference when you have to hunt through multiple files to figure out why it works. I (maybe falsely) assume that there was no comment here because you were also unsure why it worked?
A test should be added to ensure that if we ever change the TextEncoder approach, lone surrogates are still replaced. Afterwards I don't have much of an issue landing this.
import assert from 'node:assert'
// a lone surrogate in the body must come back as U+FFFD
const response = new Response('\ud801')
assert.deepStrictEqual(await response.text(), '\ufffd')
I think it's worth landing anyway; the performance bump is noticeable.
Does it affect the fetch() bench at all?
Benchmarks that retrieve data show no performance improvement; the gain only appears when sending string bodies.
Benchmark added. #2905
Can you rebase on main to pick up the benchmarks?
Is it ready for review? Should @KhafraDev review again?