node-bencode
`node-bencode` can produce dictionary entries with duplicate keys.
Bug
node-bencode assumes that distinct JavaScript string keys always encode to distinct byte-string keys. That assumption is false: converting a key to UTF-8 can map two different JavaScript strings to the same bytes, so the encoded dictionary can contain duplicate keys.
https://github.com/webtorrent/node-bencode/blob/ee70f267c8d34b9a94820ca8c42cd67d1274fc89/lib/encode.js#L53-L55
https://github.com/ThaUnknown/uint8-util/blob/149c44c010b3ad17a7904c4266545bbca1fd4403/_node.js#L13
// node-bencode, lib/encode.js
encode.string = function (buffers, data) {
  buffers.push(text2arr(text2arr(data).byteLength + ':' + data))
}

// uint8-util, _node.js
export const text2arr = str => new Uint8Array(Buffer.from(str, 'utf8'))
Proof-of-concept
For example, ask node-bencode to encode {"\uD800": 1, "\uDFFF": 2}. It produces two dictionary entries with the same key, "3:\xEF\xBF\xBD".
const lone_surrogates = "\uD800\uDFFF";
// Taken together these two code units form a valid surrogate pair, but each
// one on its own (index 0 and index 1) is a lone (“unmatched”) surrogate,
// which is ill-formed UTF-16.
const a = Buffer.from(lone_surrogates[0], "UTF-8");
const b = Buffer.from(lone_surrogates[1], "UTF-8");
// Buffer.from() reads each JavaScript (UTF-16) string and encodes it as UTF-8.
console.log(a, a.toString(), b, b.toString());
// A lone surrogate has no UTF-8 encoding, so each one is replaced with
// `REPLACEMENT CHARACTER` (U+FFFD), which encodes to `<Buffer ef bf bd>`.
console.log(a.equals(b));
// Prints true, even though (lone_surrogates[0] === lone_surrogates[1]) is false.
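To see the effect at the dictionary level without installing the library, here is a minimal sketch that mirrors the quoted `encode.string` using only Node's `Buffer`. The names `encodeKey` and `encodeDict` are hypothetical and not part of node-bencode. The sketch handles integer values only, and its key ordering is a plain `Array.prototype.sort()`, which may differ from what the library does.

```javascript
// Simplified stand-in for the quoted encode.string: "<byte length>:<bytes>".
// encodeKey and encodeDict are hypothetical names used only for this sketch.
const encodeKey = (str) => {
  const bytes = Buffer.from(str, 'utf8');
  return Buffer.concat([Buffer.from(bytes.length + ':'), bytes]);
};

// Minimal bencode dictionary encoder: "d" + (key, value)* + "e".
// Assumes every value is an integer, encoded as "i<number>e".
const encodeDict = (obj) => {
  const parts = [Buffer.from('d')];
  for (const key of Object.keys(obj).sort()) {
    parts.push(encodeKey(key));
    parts.push(Buffer.from('i' + obj[key] + 'e'));
  }
  parts.push(Buffer.from('e'));
  return Buffer.concat(parts);
};

const encoded = encodeDict({ '\uD800': 1, '\uDFFF': 2 });

// Both keys collapse to the same five bytes, 33 3a ef bf bd ("3:" + U+FFFD).
console.log(encodeKey('\uD800').equals(encodeKey('\uDFFF'))); // true
console.log(encoded.toString('hex'));
```

The hex dump contains the byte sequence `333aefbfbd` twice, once before `i1e` and once before `i2e`, which is the duplicate key described above.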
oooh i was mentioned, yup, i have no clue what i'm looking at
Since Buffer.from("\uD800").equals(Buffer.from("\uDFFF")), node-bencode can produce multiple dictionary entries with the same key, which is invalid.
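One possible guard, sketched here as an assumption rather than anything node-bencode is known to do, is to group an object's keys by their UTF-8 bytes before encoding and reject any group with more than one member. `findCollidingKeys` is a hypothetical helper name.

```javascript
// findCollidingKeys is a hypothetical helper, not part of node-bencode.
// It groups an object's keys by their UTF-8 byte encoding and returns every
// group that contains more than one distinct JavaScript string.
const findCollidingKeys = (obj) => {
  const groups = new Map();
  for (const key of Object.keys(obj)) {
    // Hex is used only as a convenient Map key for the encoded bytes.
    const hex = Buffer.from(key, 'utf8').toString('hex');
    if (!groups.has(hex)) groups.set(hex, []);
    groups.get(hex).push(key);
  }
  return [...groups.values()].filter((keys) => keys.length > 1);
};

const collisions = findCollidingKeys({ '\uD800': 1, '\uDFFF': 2, ok: 3 });
if (collisions.length > 0) {
  console.log('keys that encode to the same bytes:', collisions);
}
```

A caller could throw instead of logging. Whether the encoder should throw, or accept only `Uint8Array` keys for non-text data, is a design decision for the maintainers.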