data.jsdelivr.com

Allow base64 encoded sha256 hash to be looked up on `/v1/lookup/hash/{hash}`

Open 0xdevalias opened this issue 6 months ago • 3 comments

Currently the lookup API endpoint allows specifying a hex-encoded sha256 hash to be looked up:

  • https://www.jsdelivr.com/docs/data.jsdelivr.com#tag--Lookup

But some of the other API endpoints provide base64-encoded sha256 hashes in their responses, e.g.

  • https://www.jsdelivr.com/docs/data.jsdelivr.com#get-/v1/packages/npm/-package-@-version-
    • hash*: string: A base64-encoded sha256 of file contents.

While I could obviously write some glue code to convert this, it would be nice to be able to provide the base64-encoded version directly, with the API either detecting the hash encoding automatically or accepting an extra param that says which encoding is being provided.
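
For reference, the glue code in question is small. A minimal sketch, assuming Node.js; `base64Sha256ToHex` is a hypothetical helper name, not something from the jsDelivr codebase, and the sample value is the sha256 of empty input rather than a real API response:

```typescript
import { Buffer } from "node:buffer";

// Hypothetical helper: converts a base64-encoded sha256 (the form returned in
// package listings) into the hex form accepted by /v1/lookup/hash/{hash}.
function base64Sha256ToHex(base64Hash: string): string {
  const bytes = Buffer.from(base64Hash, "base64");
  // A sha256 digest is always 32 bytes; anything else is not a valid hash.
  if (bytes.length !== 32) {
    throw new Error(`Expected a 32-byte sha256 digest, got ${bytes.length} bytes`);
  }
  return bytes.toString("hex");
}

// sha256 of the empty input, base64-encoded.
const emptyInputBase64 = "47DEQpj8HBSa+/TImW+5JCeuQeRkm5NMpJWZG3hSuFU=";
console.log(base64Sha256ToHex(emptyInputBase64));
// e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
```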

0xdevalias, Jun 18 '25 06:06

It looks like this is the file that handles the lookup:

https://github.com/jsdelivr/data.jsdelivr.com/blob/46389da6bcbf5ae5c153e03ba7e00cbfd2e5299b/src/routes/v1/LookupRequest.js#L6-L8

Which then calls into this to do the actual lookup:

https://github.com/jsdelivr/data.jsdelivr.com/blob/46389da6bcbf5ae5c153e03ba7e00cbfd2e5299b/src/models/File.js#L47-L59

A contrived example of how an auto-detecting version might look:

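function tryDecodeHash(input: string): Buffer {
  // Hex: a SHA-256 digest is exactly 64 hex characters.
  if (/^[0-9a-fA-F]{64}$/.test(input)) return Buffer.from(input, 'hex');

  // Base64 (standard or URL-safe alphabet): 43 characters plus optional '=' padding.
  // Buffer.from(..., 'base64') never throws and silently skips invalid
  // characters, so validate the shape up front instead of using try/catch.
  if (/^[A-Za-z0-9+\/_-]{43}=?$/.test(input)) {
    const buf = Buffer.from(input, 'base64');
    // Check that the decoded buffer is 32 bytes (SHA-256 length).
    if (buf.length === 32) return buf;
  }

  throw new Error('Invalid hash format: expected hex or base64');
}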

const hashBuffer = tryDecodeHash(this.params.hash);
let file = await File.getBySha256(hashBuffer);

0xdevalias, Jun 18 '25 06:06

While auto-detection isn't too hard to do, I'm generally against adding options to the API when the client can easily handle the conversion itself, as it reduces our ability to cache things at the CDN level. Also, the reason for using hex in this case is that base64 contains / characters, which many clients and OpenAPI tools don't expect in path variables.
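
To illustrate the path-variable point, here is a small sketch. The base URL and the `hashSegments` helper are assumptions for illustration only, not part of the API code, and the digest is the sha256 of empty input:

```typescript
// Assumed base URL, for illustration only.
const LOOKUP_PREFIX = "/v1/lookup/hash/";
const LOOKUP_BASE = "https://data.jsdelivr.com" + LOOKUP_PREFIX;

// Hypothetical helper: returns the path segments that follow the fixed
// prefix once the hash is placed into the URL as a path variable.
function hashSegments(hash: string): string[] {
  const { pathname } = new URL(LOOKUP_BASE + hash);
  return pathname.slice(LOOKUP_PREFIX.length).split("/");
}

// The same digest (sha256 of empty input) in both encodings.
const hexHash = "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855";
const base64Hash = "47DEQpj8HBSa+/TImW+5JCeuQeRkm5NMpJWZG3hSuFU=";

console.log(hashSegments(hexHash).length); // 1: hex is always a single segment
console.log(hashSegments(base64Hash).length); // 2: the '/' splits the path variable
console.log(hashSegments(encodeURIComponent(base64Hash)).length); // 1: only if every client escapes it
```

Percent-encoding keeps the base64 value in one segment, but that relies on every client and intermediary handling `%2F` consistently, whereas hex needs no escaping at all.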

MartinKolarik, Jun 18 '25 12:06

@MartinKolarik Yeah, that's fair enough.

Handling it myself is fairly trivial when writing code. The main reason I hit this today was that I was exploring the API manually, and it felt weird that I couldn't use the values provided in the responses directly in the lookup.

0xdevalias, Jun 18 '25 13:06