node icon indicating copy to clipboard operation
node copied to clipboard

zlib: expose zlib.crc32()

Open joyeecheung opened this issue 1 month ago • 12 comments

This patch exposes the crc32() function from zlib to user-land.

It computes a 32-bit Cyclic Redundancy Check checksum of data. If value is specified, it is used as the starting value of the checksum, otherwise, 0 is used as the starting value.

The CRC algorithm is designed to compute checksums and to detect error in data transmission. It's not suitable for cryptographic authentication.

To be consistent with other APIs, if the data is a string, it will be encoded with UTF-8 before being used for computation. If users only use Node.js to compute and match the checksums, this works well with other APIs that uses the UTF-8 encoding by default.

Some third-party JavaScript libraries compute the checksum on a string based on str.charCodeAt() so that it can be run in browsers. If users want to match the checksum computed with this kind of library in the browser, it's better to use the same library in Node.js if it also runs in Node.js. If users have to use zlib.crc32() to match the checksum produced by such a third-party library:

  1. If the library accepts Uint8Array as input, use TextEncoder in the browser to encode the string into a Uint8Array with UTF-8 encoding, and compute the checksum based on the UTF-8 encoded string in the browser.
  2. If the library only takes a string and compute the data based on str.charCodeAt(), on the Node.js side, convert the string into a buffer using Buffer.from(str, 'utf16le').
const zlib = require('node:zlib');
const { Buffer } = require('node:buffer');

let crc = zlib.crc32('hello');  // 907060870
crc = zlib.crc32('world', crc);  // 4192936109

crc = zlib.crc32(Buffer.from('hello', 'utf16le'));  // 1427272415
crc = zlib.crc32(Buffer.from('world', 'utf16le'), crc);  // 4150509955

NOTE: it's exposed in zlib because that's also what Python and Ruby do.

joyeecheung avatar Apr 25 '24 17:04 joyeecheung

Wouldn't it be better to expose it under crypto?

Edit: On second thought, maybe not.

lpinca avatar Apr 25 '24 19:04 lpinca

Wouldn't it be better to expose it under crypto?

It's not a cryptographic hash, but a checksum. I was thinking about copying the python documentation (python also puts it on zlib):

The algorithm is not cryptographically strong, and should not be used for authentication or digital signatures. Since the algorithm is designed for use as a checksum algorithm, it is not suitable for use as a general hash algorithm.

(Not sure if it's okay to copy other documentation, but if we have to write one from scratch it wouldn't be too different anyway?)

joyeecheung avatar Apr 25 '24 19:04 joyeecheung

Wouldn't it be better to expose it under crypto?

Edit: On second thought, maybe not.

I had the exact same thought process.

The algorithm is not cryptographically strong, and should not be used for authentication or digital signatures. Since the algorithm is designed for use as a checksum algorithm, it is not suitable for use as a general hash algorithm.

I agree this documentation bit or something to its effect is necessary to circumvent the same brain exercises myself and @lpinca went through :)

panva avatar Apr 26 '24 09:04 panva

The doc currently puts it as

The CRC algorithm is designed to compute checksums and to detect error in data transmission. It's not suitable for cryptographic authentication.

Also, note that both Python and Ruby puts crc32 under a zlib module

joyeecheung avatar Apr 26 '24 09:04 joyeecheung

Failed to start CI
   ⚠  Something was pushed to the Pull Request branch since the last approving review.
   ✘  Refusing to run CI on potentially unsafe PR
https://github.com/nodejs/node/actions/runs/8848132665

github-actions[bot] avatar Apr 26 '24 12:04 github-actions[bot]

huh, this is...amusing. I'll start from Jenkins.

joyeecheung avatar Apr 26 '24 12:04 joyeecheung

CI: https://ci.nodejs.org/job/node-test-pull-request/58716/

nodejs-github-bot avatar Apr 26 '24 12:04 nodejs-github-bot

CI: https://ci.nodejs.org/job/node-test-pull-request/58721/

nodejs-github-bot avatar Apr 26 '24 17:04 nodejs-github-bot

CI: https://ci.nodejs.org/job/node-test-pull-request/58741/

nodejs-github-bot avatar Apr 27 '24 08:04 nodejs-github-bot

CI: https://ci.nodejs.org/job/node-test-pull-request/58768/

nodejs-github-bot avatar Apr 28 '24 05:04 nodejs-github-bot

CI: https://ci.nodejs.org/job/node-test-pull-request/58803/

nodejs-github-bot avatar Apr 29 '24 18:04 nodejs-github-bot

CI: https://ci.nodejs.org/job/node-test-pull-request/58821/

nodejs-github-bot avatar Apr 30 '24 15:04 nodejs-github-bot

Updated to run clang-format on the code. Can I get a re-approval to land it? @benjamingr @jasnell @addaleax @lpinca @anonrig @VoltrexKeyva

joyeecheung avatar May 01 '24 16:05 joyeecheung

Landed in 2cfabb11e161fe3607e62951c4cd1311f214ac61

nodejs-github-bot avatar May 02 '24 12:05 nodejs-github-bot