protobuf-es
Use String.prototype.isWellFormed() once it's widely available
Protobuf requires strings to be valid UTF-8. When serializing, we check strings via encodeURIComponent, which is far from ideal for performance: it builds a percent-encoded copy of the string that we immediately discard.
String.prototype.isWellFormed is a suitable alternative. On Node.js, it shows significantly better performance, especially for longer strings:
$ node --version
v24.5.0
$ node ./t.ts encodeURIComponent 10
node ./t.ts isWellFormed 10
node ./t.ts encodeURIComponent 100
node ./t.ts isWellFormed 100
node ./t.ts encodeURIComponent 1000
node ./t.ts isWellFormed 1000
encodeURIComponent with string length 10: 77.23291699999999 ms
isWellFormed with string length 10: 16.621958 ms
encodeURIComponent with string length 100: 34.113417 ms
isWellFormed with string length 100: 3.5224170000000044 ms
encodeURIComponent with string length 1000: 29.431917 ms
isWellFormed with string length 1000: 0.5034169999999989 ms
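For context, the only JavaScript strings that cannot be encoded as UTF-8 are those containing lone (unpaired) surrogates. The following is a minimal sketch of what the existing check accepts and rejects. The helper name `isValidViaEncode` is hypothetical and is not the name used in protobuf-es; it only mirrors the try/catch approach described above.

```typescript
// Hypothetical helper mirroring the encodeURIComponent-based check.
// encodeURIComponent throws a URIError when the input contains a lone surrogate.
function isValidViaEncode(str: string): boolean {
  try {
    encodeURIComponent(str);
    return true;
  } catch {
    return false;
  }
}

const samples: string[] = [
  "012345678¼", // plain BMP text: encodable
  "\uD83D\uDE00", // a correctly paired surrogate (one emoji): encodable
  "abc\uD800", // lone high surrogate: not encodable
  "\uDC00abc", // lone low surrogate: not encodable
];

for (const s of samples) {
  // JSON.stringify escapes lone surrogates, so the log output stays readable.
  console.log(JSON.stringify(s), isValidViaEncode(s));
}
```

String.prototype.isWellFormed() answers the same question (does the string contain a lone surrogate?) without producing an encoded copy.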
Benchmark script
// t.ts
// Usage: node ./t.ts <encodeURIComponent|isWellFormed> <10|100|1000>
const type = process.argv[2];
let checkUtf8: (str: string) => boolean;
switch (type) {
  case "encodeURIComponent":
    // Current approach: encodeURIComponent throws on lone surrogates.
    checkUtf8 = function checkUtf8(str: string) {
      try {
        encodeURIComponent(str);
        return true;
      } catch (_) {
        return false;
      }
    };
    break;
  case "isWellFormed":
    // Proposed approach: ask the engine directly.
    checkUtf8 = function checkUtf8(str: string) {
      // isWellFormed may be missing from the configured TS lib typings.
      // @ts-expect-error
      return str.isWellFormed();
    };
    break;
  default:
    throw new Error("Unknown type: " + type);
}
const strLen = process.argv[3];
let strings: string[];
// Each case checks 10 million characters in total, split into strings of the given length.
switch (strLen) {
  case "10":
    strings = new Array(1_000_000).fill("012345678¼");
    break;
  case "100":
    strings = new Array(100_000).fill("012345678¼".repeat(10));
    break;
  case "1000":
    strings = new Array(10_000).fill("012345678¼".repeat(100));
    break;
  default:
    throw new Error("Unknown strLen: " + strLen);
}
const start = performance.now();
for (const str of strings) {
  if (!checkUtf8(str)) {
    throw new Error(`Unexpected invalid utf-8 ${str}`);
  }
}
const elapsed = performance.now() - start;
console.log(`${type} with string length ${strLen}: ${elapsed} ms`);
isWellFormed is not widely available yet, but it will be in April 2026. See the definition of "widely available" on MDN.
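Until then, one possible interim option would be feature detection with a fallback to the current check. This is only a sketch under assumptions, not how protobuf-es is implemented: the names `hasNativeIsWellFormed`, `checkViaEncode`, and `isWellFormedString` are hypothetical, and the type cast is used so the snippet compiles whether or not the configured TS lib declares isWellFormed.

```typescript
// Hypothetical shape used only to probe for the method without relying on lib typings.
type MaybeWellFormed = { isWellFormed?: () => boolean };

// Detect once, at module load, whether the runtime provides String.prototype.isWellFormed.
const hasNativeIsWellFormed: boolean =
  typeof (String.prototype as unknown as MaybeWellFormed).isWellFormed ===
  "function";

// Fallback: the existing encodeURIComponent-based check.
function checkViaEncode(str: string): boolean {
  try {
    encodeURIComponent(str);
    return true;
  } catch {
    return false;
  }
}

// Pick the implementation once so the hot path has no per-call feature test.
const isWellFormedString: (str: string) => boolean = hasNativeIsWellFormed
  ? (str) => (str as unknown as { isWellFormed(): boolean }).isWellFormed()
  : checkViaEncode;

console.log(isWellFormedString("012345678¼"));
console.log(isWellFormedString("abc\uD800"));
```

The trade-off is a second code path to maintain until the fallback can be dropped, which is why simply waiting for wide availability may be preferable.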
Related to: https://github.com/bufbuild/protobuf-es/issues/333