
fix(minifier): fix string concatenation for lone surrogates

Open h-a-n-a opened this issue 4 months ago • 5 comments

Fixes a simple case in optimizing string concatenation that involves lone surrogates. It also lays some groundwork and pinpoints places that need to be fixed in future PRs.

With input:

'\uD83D' + '\uDD25'
`\uD83D` + `\uDD25`

Previous output:

"�d838�dd25"
"�d838�dd25"

Now outputs:

"\ud83d\udd25"
"\ud83d\udd25"
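
As a rough illustration of why the two halves need to survive concatenation intact: `\uD83D` is a high surrogate and `\uDD25` is a low surrogate, and together they encode a single code point. The sketch below is plain JavaScript, not Oxc code, and `combineSurrogates` is a hypothetical helper introduced only for this example.

```javascript
// Hypothetical helper: combine a high and a low surrogate into the
// code point they encode together (standard UTF-16 arithmetic).
function combineSurrogates(high, low) {
  return 0x10000 + ((high - 0xd800) << 10) + (low - 0xdc00);
}

// Each operand on its own is a lone surrogate; concatenated they form a pair.
const joined = '\uD83D' + '\uDD25';
const codePoint = combineSurrogates(0xd83d, 0xdd25);

console.log(joined.length);                       // 2 (UTF-16 code units)
console.log(joined.codePointAt(0) === codePoint); // true
console.log(codePoint.toString(16));              // 1f525
```

A minifier that folds this concatenation therefore has to emit something that still evaluates to those same two code units, which the escaped form above does.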

This ports some of the optimization fixes I made in https://github.com/swc-project/swc/pull/10987. I really appreciate the work you have all done: without the earlier investigation by @overlookmotel, @Boshen, and the Oxc team, this would not have happened.

h-a-n-a avatar Aug 20 '25 11:08 h-a-n-a

How to use the Graphite Merge Queue

Add either label to this PR to merge it via the merge queue:

  • 0-merge - adds this PR to the back of the merge queue
  • hotfix - for urgent hot fixes, skip the queue and merge this PR next

You must have a Graphite account in order to use the merge queue. Sign up using this link.

An organization admin has enabled the Graphite Merge Queue in this repository.

Please do not merge from GitHub as this will restart CI on PRs being processed by the merge queue.

graphite-app[bot] avatar Aug 20 '25 11:08 graphite-app[bot]

This is a lot of code to support lone surrogates, I'll take a deeper look later.

Boshen avatar Aug 20 '25 11:08 Boshen

CodSpeed Instrumentation Performance Report

Merging #13229 will not alter performance

Comparing h-a-n-a:lone-surrogates-optimization (f1b2f8c) with main (d27a04b).

Note: No successful run was found on main (e84ae8b) during the generation of this report, so d27a04b was used instead as the comparison base. There might be some changes unrelated to this pull request in this report.

Summary

✅ 34 untouched benchmarks

codspeed-hq[bot] avatar Aug 20 '25 11:08 codspeed-hq[bot]

> This is a lot of code to support lone surrogates, I'll take a deeper look later.

@Boshen Thanks! That would be great. I'd love to fix all the TODOs I listed in this PR, probably in future PRs. If you need any help, just let me know ;-)

h-a-n-a avatar Aug 20 '25 11:08 h-a-n-a

FYI: we've replaced our implementation here in swc with Wtf8Buf.

See:

  1. https://github.com/swc-project/swc/pull/11104
  2. https://github.com/swc-project/swc/pull/11144
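
For context, a minimal sketch of why a plain UTF-8 string cannot carry a lone surrogate, which is the gap that WTF-8 (a superset of UTF-8 that can also encode unpaired surrogates) is meant to cover. It uses only the standard `TextEncoder` and `TextDecoder` APIs and does not reflect the actual swc or Oxc code.

```javascript
// A lone high surrogate: a valid JavaScript string value,
// but not representable in well-formed UTF-8.
const loneSurrogate = '\uD83D';

// TextEncoder always produces UTF-8 and substitutes U+FFFD
// (bytes EF BF BD) for the unpaired surrogate.
const utf8Bytes = Array.from(new TextEncoder().encode(loneSurrogate));

// Decoding those bytes yields U+FFFD, so the original value is lost.
const roundTripped = new TextDecoder().decode(new Uint8Array(utf8Bytes));

console.log(utf8Bytes.map((b) => b.toString(16))); // [ 'ef', 'bf', 'bd' ]
console.log(roundTripped === loneSurrogate);       // false
console.log(roundTripped === '\uFFFD');            // true
```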

h-a-n-a avatar Nov 10 '25 11:11 h-a-n-a