xxhash-wasm

Buffer output

Open ronag opened this issue 2 years ago • 1 comments

It would be nice to have an option to get the hash as a buffer, for ease of use with e.g. https://github.com/mafintosh/turbo-hash-map

ronag, Apr 08 '22 07:04

The hashes are uints (represented as number for uint32 and bigint for uint64 in JS) and at no point are they stored as anything other than uint. I am assuming the content of the buffer would be the memory representation of that uint, which you can create by writing the hash value into a buffer.

You first need to allocate enough space for the 32-bit (4 bytes) or 64-bit (8 bytes) hash with Buffer.allocUnsafe, and then you simply write the value into the buffer with buf.writeUint32BE or buf.writeBigUint64BE, respectively.

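// h32 and h64 are the hash functions from xxhash-wasm,
// e.g. const { h32, h64 } = await xxhash();

// 32-bit hash, is a number and requires 4 bytes
const hash32 = h32(input);
const buf32 = Buffer.allocUnsafe(4);
// Write the hash at offset 0 in Big-Endian byte order
buf32.writeUint32BE(hash32);

// 64-bit hash, is a bigint and requires 8 bytes
const hash64 = h64(input);
const buf64 = Buffer.allocUnsafe(8);
// Write the hash at offset 0 in Big-Endian byte order
buf64.writeBigUint64BE(hash64);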

Here, they are written as Big-Endian. If necessary, there are also Little-Endian versions (LE instead of BE in the method name), and if you need to handle platforms separately, os.endianness() tells you the endianness of the current platform. If it's just for comparisons like in that hashmap, the byte order probably doesn't matter, as long as it's the same for every hash: the values are never read back from the buffer, which only serves as the memory holding the bytes of the uint representation.
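To make the byte-order point concrete, here is a minimal sketch. The hash values are made-up constants standing in for the results of h32 and h64, not real hashes. It only shows that BE and LE produce reversed bytes, and that two buffers written with the same byte order compare equal.

```javascript
// Made-up values standing in for h32(input) and h64(input) (assumption).
const sample32 = 0x12345678;
const sample64 = 0x0123456789abcdefn;

// Same 32-bit value, written in both byte orders.
const be32 = Buffer.allocUnsafe(4);
be32.writeUint32BE(sample32);
const le32 = Buffer.allocUnsafe(4);
le32.writeUint32LE(sample32);

// Same 64-bit value, written in both byte orders.
const be64 = Buffer.allocUnsafe(8);
be64.writeBigUint64BE(sample64);
const le64 = Buffer.allocUnsafe(8);
le64.writeBigUint64LE(sample64);

console.log(be32.toString("hex")); // 12345678
console.log(le32.toString("hex")); // 78563412
console.log(be64.toString("hex")); // 0123456789abcdef
console.log(le64.toString("hex")); // efcdab8967452301

// Writing the same value with the same byte order gives equal buffers,
// which is all a byte-wise comparison needs.
const again32 = Buffer.allocUnsafe(4);
again32.writeUint32BE(sample32);
console.log(be32.equals(again32)); // true
console.log(be32.equals(le32)); // false
```

Mixing byte orders within one map would break equality, so pick one and stick to it.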

There is no advantage to having it integrated into this library, since there's no additional overhead compared to doing it yourself. Also, because this library keeps everything compatible between Browser and Node, it wouldn't use Buffer directly; it would write into a Uint8Array with DataView.prototype.setUint32() and DataView.prototype.setBigUint64() instead. That seems to be much slower than writing directly into the Node buffer (even when sharing the underlying array buffer with the Buffer, so it's really the writing that is slower). Interestingly, the DataView's setUint32 and setBigUint64 take roughly the same time, whereas with Node's Buffer, writeUint32BE is much faster than writeBigUint64BE.
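For reference, the Browser-compatible variant could look roughly like the sketch below. The helper names hash32ToBytes and hash64ToBytes are hypothetical and not part of xxhash-wasm, and the inputs are made-up constants rather than real hashes.

```javascript
// Hypothetical helpers (not part of xxhash-wasm): write a hash into a
// Uint8Array through a DataView, which works in both Browser and Node.
function hash32ToBytes(hash, littleEndian = false) {
  const bytes = new Uint8Array(4);
  // setUint32(byteOffset, value, littleEndian); Big-Endian by default
  new DataView(bytes.buffer).setUint32(0, hash, littleEndian);
  return bytes;
}

function hash64ToBytes(hash, littleEndian = false) {
  const bytes = new Uint8Array(8);
  // setBigUint64 expects a bigint, which is what a 64-bit hash is in JS
  new DataView(bytes.buffer).setBigUint64(0, hash, littleEndian);
  return bytes;
}

// Made-up values standing in for h32(input) and h64(input) (assumption).
const bytes32 = hash32ToBytes(0x12345678);
const bytes64 = hash64ToBytes(0x0123456789abcdefn);

console.log(Array.from(bytes32)); // [ 18, 52, 86, 120 ]

// In Node, a Buffer can wrap the same memory without copying.
const shared = Buffer.from(bytes64.buffer, bytes64.byteOffset, bytes64.byteLength);
console.log(shared.toString("hex")); // 0123456789abcdef
```

The resulting bytes are identical to what the Buffer methods produce, so the choice between the two only comes down to portability and speed.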

jungomi, Apr 08 '22 22:04