buffer
buffer copied to clipboard
Use TextDecoder for toString('utf8')
Closes #268
This uses TextDecoder for toString('utf8') and toString().
I needed to update some tests so that they are in line with Node's native Buffer (which also makes them pass with TextDecoder), I hope this was correct?
Technically, it also supports latin1, utf-16le, but the conversion is different from Node for strings that aren't representable in these encodings:
latin1:
Buffer content: 'Ö' = <UTF8 Buffer c3 96>
Output:
TextDecoder latin1: 'Ö' <Buffer c3 13>
Node Buffer latin1: 'Ã' <Buffer c3 96>
utf16: (TextDecoder adds an "�" at the end. https://www.compart.com/en/unicode/U+FFFD)
Buffer content: 'abc' = <UTF8 Buffer 61 62 63>
Output:
TextDecoder utf-16le: e6 89 a1 ef bf bd
Node utf16le: e6 89 a1
Great initiative @mischnic. I am using Buffer as a drop-in replacement for Node's versions. Changing the tests wouldn't work for me as then Buffer couldn't be used for that. Would it be possible to adjust the output of decoderUTF8.decode(buf.slice(start, end)) to adjust for this case? Maybe remove the "�" with utf16le and replace 0x13 with 0x96 with lading encoding?
I only adjusted those test which were apparently deviating from Node's Buffer. For example try running this
> new Buffer([0xF4, 0x8F, 0x80]).toString().length
1
so this test was apparently wrong
t.equal(
new B([0xF4, 0x8F, 0x80]).toString(),
'\uFFFD\uFFFD\uFFFD'
)
I only used TextDecoder for utf8 because it seems to align with Buffer.toString("utf8"). The handling of the other encodings (utf16, latin) is still the same.
@feross ?
There is apparently some breakeven point where TextDecoder becomes faster then the existing implementation:
Using node perf/readUtf8.js, testing 256 byte buffers and new Buffer('7c'.repeat(5e7), 'hex') for the "big" variants
master:
BrowserBuffer#readUtf8 x 414,259 ops/sec ±2.91% (85 runs sampled)
NodeBuffer#readUtf8 x 486,114 ops/sec ±3.01% (84 runs sampled)
BrowserBuffer#readUtf8 big x 0.98 ops/sec ±5.58% (7 runs sampled)
NodeBuffer#readUtf8 big x 34.31 ops/sec ±1.56% (58 runs sampled)
this:
BrowserBuffer#readUtf8 x 195,525 ops/sec ±2.06% (86 runs sampled)
NodeBuffer#readUtf8 x 486,587 ops/sec ±2.24% (79 runs sampled)
BrowserBuffer#readUtf8 big x 18.61 ops/sec ±11.77% (38 runs sampled)
NodeBuffer#readUtf8 big x 35.19 ops/sec ±1.76% (61 runs sampled)