
feat: Added string.base64.parse

Open Abion47 opened this issue 1 year ago • 1 comments

This PR adds the string.base64.parse keyword submodule, which parses a base64 string into a Uint8Array.

  • [X] Code is up-to-date with the main branch
  • [ ] You've successfully run pnpm prChecks locally
  • [X] There are new or updated unit tests validating the change

Notes:

The method for converting a base64 string into a Uint8Array is adapted from the base64-js package, since Buffer.from isn't available in browsers and btoa is notoriously slow.
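For readers unfamiliar with the lookup-table approach, here is a minimal sketch of decoding base64 into a Uint8Array without Buffer or the browser helpers. It is not the code from this PR or from base64-js; the names `decodeBase64`, `ALPHABET`, and `REVERSE` are hypothetical, and it assumes padded, standard-alphabet input.

```typescript
// Illustrative sketch only; names are hypothetical and not from the PR.
const ALPHABET =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"

// Reverse lookup: character code -> 6-bit value, 255 marks "not in alphabet".
const REVERSE = new Uint8Array(256).fill(255)
for (let i = 0; i < ALPHABET.length; i++) REVERSE[ALPHABET.charCodeAt(i)] = i

function decodeBase64(input: string): Uint8Array {
  // Assumption: input is padded, so its length is a multiple of 4.
  if (input.length % 4 !== 0)
    throw new Error("base64 length must be a multiple of 4")

  // Trailing "=" characters tell us how many output bytes to drop.
  let padding = 0
  if (input.endsWith("==")) padding = 2
  else if (input.endsWith("=")) padding = 1

  const dataLength = input.length - padding
  const out = new Uint8Array((input.length / 4) * 3 - padding)

  let outIndex = 0
  let buffer = 0 // accumulates pending bits
  let bits = 0 // number of pending bits in buffer

  for (let i = 0; i < dataLength; i++) {
    // Codes outside the table (>= 256) read as undefined and fall back to 255.
    const value = REVERSE[input.charCodeAt(i)] ?? 255
    if (value === 255)
      throw new Error(`invalid base64 character at index ${i}`)

    buffer = (buffer << 6) | value
    bits += 6
    if (bits >= 8) {
      bits -= 8
      out[outIndex++] = (buffer >> bits) & 0xff
      buffer &= (1 << bits) - 1 // keep only the bits not yet emitted
    }
  }
  return out
}
```

Calling `decodeBase64("aGVsbG8=")` yields the five bytes of the ASCII string "hello". A production version would likely process four characters at a time, as base64-js does, rather than one.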

I ran a Node.js benchmark comparing this solution with btoa and Buffer.from, and got the following results (the test base64 string was a 31KB image):

┌─────────┬───────────┬──────────┬────────────────────┬──────────┬─────────┐
│ (index) │ Task Name │ ops/sec  │ Average Time (ns)  │ Margin   │ Samples │
├─────────┼───────────┼──────────┼────────────────────┼──────────┼─────────┤
│ 0       │ 'btoa'    │ '852'    │ 1173658.2565185907 │ '±0.27%' │ 51123   │
│ 1       │ 'buffer'  │ '3,448'  │ 289958.9299943234  │ '±0.11%' │ 206926  │
│ 2       │ 'b64-js'  │ '48,179' │ 20755.868277051777 │ '±1.26%' │ 2890749 │
└─────────┴───────────┴──────────┴────────────────────┴──────────┴─────────┘

The btoa result isn't surprising, but if these results are to be believed (and that's a huge "if"), the base64-js solution is an order of magnitude faster than Buffer.from (roughly 14x by ops/sec). I'm not sure I trust that conclusion, but seeing as Buffer.from isn't an option anyway, I'm willing to take the win.

TODO:

  • After discussion regarding the API, implement string.base64.url.parse
    • Alternatively, have string.base64.parse handle both base64 and base64url strings (which is sort of already supported by the base64-js source).
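As a rough illustration of the second option, one way to accept both alphabets is to normalize base64url input to standard padded base64 before decoding. This is only a sketch under that assumption, not a proposal for the final API; `normalizeBase64` is a hypothetical name.

```typescript
// Hypothetical helper: rewrite base64url input as standard, padded base64.
function normalizeBase64(input: string): string {
  // base64url uses "-" and "_" where standard base64 uses "+" and "/".
  const standard = input.replace(/-/g, "+").replace(/_/g, "/")

  // base64url commonly omits padding; restore it so length is a multiple of 4.
  const remainder = standard.length % 4
  if (remainder === 1) throw new Error("invalid base64 length")
  return remainder === 0 ? standard : standard + "=".repeat(4 - remainder)
}
```

Standard input passes through unchanged, so a single parse keyword could call this first and then decode. The trade-off is that the keyword would no longer distinguish the two encodings, which may or may not be desirable for validation.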

Abion47 · Oct 02 '24 00:10

Something I discovered while writing the tests: I intended to base them on the TypedArray/Uint8 tests, but those don't appear to have been written yet. Furthermore, when I pass the return value of b64parse(...) directly to attest(b64parse(...)).snap(...), it fails statically, because snap appears to type the expected value as some kind of primitive array rather than as a Uint8Array. I'm not sure whether that is because I wrote the module or the test wrong. For now, the test decodes the resulting bytes back into a UTF-8 string and checks the string value (which does pass).
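The workaround described above can be sketched roughly as follows. This is not the test from the PR; `bytesToUtf8` is a hypothetical helper, and the byte values stand in for whatever the parse keyword returns.

```typescript
// Hypothetical helper: turn parsed bytes into a string so a test can
// compare a primitive instead of a Uint8Array.
function bytesToUtf8(bytes: Uint8Array): string {
  // TextDecoder is a global in both browsers and Node.js.
  return new TextDecoder("utf-8").decode(bytes)
}

// Stand-in for the parser's output: the UTF-8 bytes of "hello".
const parsed = new Uint8Array([104, 101, 108, 108, 111])
const text = bytesToUtf8(parsed)
```

This only works for payloads that are valid UTF-8 text; for arbitrary binary data, comparing `Array.from(parsed)` against an expected number array would be a more general fallback.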

Abion47 · Oct 02 '24 00:10

Closing this for now, but definitely interested in a follow-up based on the discussion above!

ssalbdivad · Nov 11 '24 20:11