Wrong or insufficient result on API get /v1/lookup/hash/{hash}
Description
Description: When calling the API endpoint /v1/lookup/hash/{hash} with the file hash of jquery 3.7.1, I get back a wrong result, or at least not the npm result.
Steps to Reproduce: To make sure we have the right file hash, we first call another API endpoint for jquery 3.7.1: https://data.jsdelivr.com/v1/packages/npm/[email protected]
Result (partial):
{ "type": "file", "name": "jquery.min.js", "hash": "/JqT3SQfawRcv/BIHPThkBvs0OEvtFFmqPF/lYI/Cxo=", "size": 87533 }
Base64 -> Hex: hash=fc9a93dd241f6b045cbff0481cf4e1901becd0e12fb45166a8f17f95823f0b1a
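The Base64-to-hex step above can be reproduced with a few lines of Python. This is only a minimal sketch: the helper names `base64_to_hex` and `lookup_url` are made up for illustration, and the base URL is simply the lookup endpoint quoted in this report.

```python
import base64

# Lookup endpoint as quoted in this issue.
LOOKUP_BASE = "https://data.jsdelivr.com/v1/lookup/hash/"


def base64_to_hex(b64_hash: str) -> str:
    """Decode a Base64-encoded digest and return it as lowercase hex."""
    return base64.b64decode(b64_hash, validate=True).hex()


def lookup_url(b64_hash: str) -> str:
    """Build the hash lookup URL for a Base64-encoded digest (hypothetical helper)."""
    return LOOKUP_BASE + base64_to_hex(b64_hash)


# Hash of jquery.min.js as returned by the packages endpoint above.
print(base64_to_hex("/JqT3SQfawRcv/BIHPThkBvs0OEvtFFmqPF/lYI/Cxo="))
# fc9a93dd241f6b045cbff0481cf4e1901becd0e12fb45166a8f17f95823f0b1a
```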
Use this hash in the other endpoint: https://data.jsdelivr.com/v1/lookup/hash/fc9a93dd241f6b045cbff0481cf4e1901becd0e12fb45166a8f17f95823f0b1a
Result:
{ "type": "gh", "name": "madmaxchow/VLOOK", "version": "master", "file": "/docs/js/jquery.js" }
So the result points to an identical file in the docs directory of an unrelated GitHub repository instead of the npm source.
Expected behavior: Searching via the API with a file hash, e.g. https://data.jsdelivr.com/v1/lookup/hash/fc9a93dd241f6b045cbff0481cf4e1901becd0e12fb45166a8f17f95823f0b1a, returns the npm source, or at least all occurrences.
Actual behavior: Searching via the API with a file hash, e.g. https://data.jsdelivr.com/v1/lookup/hash/fc9a93dd241f6b045cbff0481cf4e1901becd0e12fb45166a8f17f95823f0b1a, returns the file's location in an arbitrary GitHub repository.
Affected jsDelivr links
https://data.jsdelivr.com/v1/lookup/hash/fc9a93dd241f6b045cbff0481cf4e1901becd0e12fb45166a8f17f95823f0b1a
https://data.jsdelivr.com/v1/packages/npm/[email protected]
Response headers
accept-ranges: bytes
access-control-allow-origin: *
access-control-expose-headers: *
age: 300684
alt-svc: h3=":443";ma=86400,h3-29=":443";ma=86400,h3-27=":443";ma=86400
cache-control: public, max-age=31536000, stale-while-revalidate=86400, stale-if-error=86400
cf-cache-status: DYNAMIC
cf-ray: 92713a82bf0971c5-FRA
content-encoding: br
content-length: 86
content-type: application/json; charset=utf-8
cross-origin-resource-policy: cross-origin
date: Tue, 08 Apr 2025 14:13:13 GMT
etag: W/"63-TDFDeXPyw9SXIRq67li0Y32jaIk"
rndr-id: 1d8d7af3-2eb2-4c1e
server: cloudflare
timing-allow-origin: *
vary: Accept-Encoding, Accept-Encoding
via: 1.1 varnish
x-cache: HIT
x-render-origin-server: Render
x-response-time: 8ms
x-robots-tag: noindex
x-served-by: cache-fra-eddf8230141-FRA
x-timer: S1744121593.995989,VS0,VE4
Information
- Device OS: Ubuntu 22
- Commandline
- Your location: Germany
Requisites
- [x] I performed a cursory search of the issue tracker to avoid opening a duplicate issue.
- [x] I checked the documentation to understand that the issue I am reporting is not normal behavior.
- [x] I understand that not filling out this template correctly will lead to the issue being closed.
Hey, we'll consider possible improvements here, but the current behavior matches the documentation:
Allows a reverse lookup of a file at the CDN by its hash. Works only for files which were accessed at least once. If there are multiple files with the same hash, only the one which was accessed first via the CDN is returned.
Can you please better describe your use case?
Hey, I am happy to describe my use case here. I receive artifacts from various sources that were created without a package.json or similar. I now need their exact dependencies in order to create an SBOM. Without a search by file hash, this process is incredibly time-consuming and difficult, and above all it cannot be automated.
We could maybe add an option to list all packages that have the file instead of returning just one. But then you'd need to somehow select the "right" one, which might still be hard (there are 20+ matches in this case).
Tell me if I am wrong, but if I only look at npm first, the result should be unique, right?
No, there can be several matching npm packages as well.
Okay, then it could be hard to figure out the right package, but it should be possible somehow. Could you provide me with an example? Maybe I could figure out whether we would be able to use a full list in our use case.
Here are sample results for fc9a93dd241f6b045cbff0481cf4e1901becd0e12fb45166a8f17f95823f0b1a
gh,madmaxchow/VLOOK,master,/docs/js/jquery.js
gh,mixice/uigg,master,/js/jquery.min.js
gh,appotry/hexo,master,/libs/jquery/jquery.min.js
gh,jquery/jquery-dist,main,/dist/jquery.min.js
npm,jquery,3.7.1,/dist/jquery.min.js
gh,jquery/jquery,3.7.1,/dist/jquery.min.js
gh,jquery/jquery-dist,3.7.1,/dist/jquery.min.js
gh,fxxk3rrth4ng/utils,master,/jquery.js
gh,yi-yunseok/Yi-Yunseok.github.io,master,/assets/js/vendors/jquery-3.7.1.min.js
gh,jquery/jquery,f79d5f1a337528940ab7029d4f8bbba72326f269,/dist/jquery.min.js
npm,beefup,1.4.11,/dist/js/jquery.min.js
gh,willsofts/will-asset,1.0.0,/jquery/jquery-3.7.1.min.js
gh,willsofts/will-asset,1.0.1,/jquery/jquery-3.7.1.min.js
gh,willsofts/will-asset,1.0.2,/jquery/jquery-3.7.1.min.js
npm,jquery,3,/dist/jquery.min.js
gh,willsofts/will-asset,1.0.3,/jquery/jquery-3.7.1.min.js
gh,willsofts/will-asset,1.0.4,/jquery/jquery-3.7.1.min.js
gh,willsofts/will-asset,1.0.5,/jquery/jquery-3.7.1.min.js
gh,willsofts/will-asset,1.0.6,/jquery/jquery-3.7.1.min.js
gh,willsofts/will-asset,1.0.7,/jquery/jquery-3.7.1.min.js
gh,willsofts/will-asset,1.0.8,/jquery/jquery-3.7.1.min.js
npm,beefup,1.4.12,/dist/js/jquery.min.js
gh,willsofts/will-asset,1.0.9,/jquery/jquery-3.7.1.min.js
gh,willsofts/will-asset,1.0.10,/jquery/jquery-3.7.1.min.js
gh,willsofts/will-asset,1.0.11,/jquery/jquery-3.7.1.min.js
gh,willsofts/will-asset,1.0.12,/jquery/jquery-3.7.1.min.js
gh,willsofts/will-asset,1.0.13,/jquery/jquery-3.7.1.min.js
gh,willsofts/will-asset,1.0.14,/jquery/jquery-3.7.1.min.js
gh,willsofts/will-asset,1.0.16,/jquery/jquery-3.7.1.min.js
gh,willsofts/will-asset,1.0.17,/jquery/jquery-3.7.1.min.js
gh,willsofts/will-asset,1.0.18,/jquery/jquery-3.7.1.min.js
npm,@liaojie1314/blog-static,1.0.0,/js/jquery3.7.1/jquery.min.js
npm,beefup,1.4.13,/dist/js/jquery.min.js
npm,webfast,0.1.25,/content/js/jquery-3.7.1.min.js
npm,webfast,0.1.24,/content/js/jquery-3.7.1.min.js
npm,webfast,0.1.28,/content/js/jquery-3.7.1.min.js
npm,webfast,0.1.21,/content/js/jquery-3.7.1.min.js
gh,willsofts/will-asset,1.0.19,/jquery/jquery-3.7.1.min.js
gh,AxKuu/jquery,3.7.1,/jquery-3.7.1.min.js
gh,davidjbradshaw/iframe-resizer,4.3.11,/example/jquery-3.7.1.min.js
npm,bsseond,1.0.0,/jquery-3.7.1.min.js
gh,jd82k/Joe,1.2.1,/assets/libs/jquery/jquery.min.js
gh,bonafide-ngo/jquery,3.7.1b0,/dist/jquery.min.js
npm,beefup,1.4.14,/dist/js/jquery.min.js
gh,multitalentedman/responsive-blog-design,275cb44ca8ab3246e9f3690b81796f7531085f12,/shared/jquery-3.7.1.min.js
gh,willsofts/will-asset,1.0.21,/jquery/jquery-3.7.1.min.js
npm,liutsnpm,1.0.1,/Joe-assets/assets/libs/jquery/jquery.min.js
npm,liutsnpm,1.0.2,/Joe-assets/assets/libs/jquery/jquery.min.js
gh,huxubo/CDN,0.0.1,/libs/jquery/3.7.1/jquery.min.js
gh,willsofts/will-asset,1.0.23,/jquery/jquery-3.7.1.min.js
gh,willsofts/will-asset,1.0.24,/jquery/jquery-3.7.1.min.js
gh,willsofts/will-asset,1.0.26,/jquery/jquery-3.7.1.min.js
gh,willsofts/will-asset,1.0.27,/jquery/jquery-3.7.1.min.js
gh,willsofts/will-asset,1.0.28,/jquery/jquery-3.7.1.min.js
gh,willsofts/will-asset,1.0.29,/jquery/jquery-3.7.1.min.js
gh,willsofts/will-asset,1.0.31,/jquery/jquery-3.7.1.min.js
gh,willsofts/will-asset,1.0.33,/jquery/jquery-3.7.1.min.js
gh,appotry/hexo,10.9,/libs/jquery/jquery.min.js
gh,appotry/hexo,10.13,/libs/jquery/jquery.min.js
gh,appotry/hexo,10.14,/libs/jquery/jquery.min.js
gh,usdos-cgfs/audit-tool,1.0.1,/lib/jquery-3.7.1.min.js
npm,abeamer,1.7.2,/client/lib/js/vendor/jquery-3.7.1.min.js
gh,usdos-cgfs/audit-tool,1.0.2,/lib/jquery-3.7.1.min.js
gh,UniversitaDellaCalabria/unicms-template-italia,1.3.1,/src/unicms_template_italia/static/js/jquery.3.7.1.min.js
gh,appotry/hexo,10.18,/libs/jquery/jquery.min.js
gh,appotry/hexo,10.20,/libs/jquery/jquery.min.js
gh,appotry/hexo,10.21,/libs/jquery/jquery.min.js
gh,appotry/hexo,10.22,/libs/jquery/jquery.min.js
gh,appotry/hexo,10.23,/libs/jquery/jquery.min.js
gh,appotry/hexo,10.25,/libs/jquery/jquery.min.js
gh,James-JohnsonBE/demo,1.0.2,/lib/jquery-3.7.1.min.js
gh,willsofts/will-asset,1.0.37,/jquery/jquery-3.7.1.min.js
gh,eyeofchaos/eocjsNewsticker,0.7.3,/jquery-3.7.1.min.js
gh,UniversitaDellaCalabria/unicms-template-italia,1.3.2,/src/unicms_template_italia/static/js/jquery.3.7.1.min.js
gh,UniversitaDellaCalabria/unicms-template-italia,1.3.3,/src/unicms_template_italia/static/js/jquery.3.7.1.min.js
gh,willsofts/will-asset,1.0.38,/jquery/jquery-3.7.1.min.js
gh,willsofts/will-asset,1.0.39,/jquery/jquery-3.7.1.min.js
gh,django/django,5.1.4,/django/contrib/admin/static/admin/js/vendor/jquery/jquery.min.js
npm,beefup,1.4.15,/dist/js/jquery.min.js
npm,beefup,1.5.0,/dist/js/jquery.min.js
npm,qexo-static,3.0.4,/qexo/jquery/jquery.min.js
npm,qexo-static,3.0.5,/qexo/jquery/jquery.min.js
npm,qexo-static,3.0.2,/qexo/jquery/jquery.min.js
npm,qexo-static,3.0.3,/qexo/jquery/jquery.min.js
npm,qexo-static,3.0.1,/qexo/jquery/jquery.min.js
gh,django/django,5.0,/django/contrib/admin/static/admin/js/vendor/jquery/jquery.min.js
npm,jsxgraph,1.10.1,/distrib/docs/static/jquery.min.js
gh,janssenproject/jans,76e0414143c7d3df6285566983fbd06f967fb715,/jans-casa/plugins/bioid/extras/agama/web/js/jquery-3.7.1.min.js
gh,janssenproject/jans,43f18a79de5b5d673456d0f66720771692760d03,/jans-casa/plugins/bioid/extras/agama/web/js/jquery-3.7.1.min.js
gh,willsofts/will-asset,1.0.40,/jquery/jquery-3.7.1.min.js
gh,django/django,5.1.6,/django/contrib/admin/static/admin/js/vendor/jquery/jquery.min.js
gh,lkzhao/Hero,1.6.4,/docs/docsets/Hero.docset/Contents/Resources/Documents/js/jquery.min.js
gh,lkzhao/Hero,1.6.4,/docs/js/jquery.min.js
npm,tornado-cdn,2.0.3,/jquery.min.js
npm,jquery.flipster,1.1.6,/demo/jquery.min.js
gh,emasgp/js,c21f6b197c3a4b4e281c22da043819479ffe53e0,/jquery-3.7.1min.js
gh,geonetwork/core-geonetwork,ae96904af72a3fef40fb8deeded6f56d2d28b746,/web-ui/src/main/resources/catalog/lib/jquery-3.7.1.min.js
You are totally right, differentiating between some of them will be pretty hard :(
Okay, I think I found a solution for my use case, if you could provide an endpoint with the full list. I can filter for npm entries. After that I just have to look up the release timestamp of each specific version via npm and take the oldest one.
@MartinKolarik should I close this bug issue (or better, creator-cant-read issue xD) and open a feature request?
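The filtering idea described above could look roughly like the following sketch. It assumes the full list arrives as `type,name,version,file` lines like the sample results in this thread (no such endpoint exists yet), and `published_at` is a hypothetical callback that returns the publish time of a package version, for example backed by the `time` field of the npm registry metadata. Entries without a known timestamp, such as the range alias `npm,jquery,3`, are skipped.

```python
import csv
from datetime import datetime
from typing import Callable, List, NamedTuple, Optional


class Match(NamedTuple):
    type: str
    name: str
    version: str
    file: str


def parse_matches(text: str) -> List[Match]:
    """Parse 'type,name,version,file' lines as shown in the sample results."""
    rows = csv.reader(line for line in text.splitlines() if line.strip())
    return [Match(*row) for row in rows if len(row) == 4]


def oldest_npm_match(
    matches: List[Match],
    published_at: Callable[[str, str], Optional[datetime]],
) -> Optional[Match]:
    """Return the npm match with the earliest publish time, or None.

    `published_at(name, version)` is a hypothetical lookup supplied by the
    caller; it should return None when the version has no known timestamp.
    """
    dated = []
    for match in matches:
        if match.type != "npm":
            continue
        timestamp = published_at(match.name, match.version)
        if timestamp is not None:
            dated.append((timestamp, match))
    return min(dated, key=lambda pair: pair[0])[1] if dated else None


# Three lines taken from the sample results above.
sample = """gh,madmaxchow/VLOOK,master,/docs/js/jquery.js
npm,jquery,3.7.1,/dist/jquery.min.js
npm,beefup,1.4.11,/dist/js/jquery.min.js"""
print([m.name for m in parse_matches(sample) if m.type == "npm"])
# ['jquery', 'beefup']
```

Passing the timestamp lookup in as a callback keeps the selection logic independent of how the publish times are actually fetched.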
No need to create a new issue, I'll take a look at this when I get some time.
We could maybe add an option to list all packages that have the file instead of returning just one.
@MartinKolarik Personally I would find that useful/interesting, more so than getting a single match for whichever source happened to be accessed first.
It would also be potentially useful to be able to filter those down with a type param (similar to what other API endpoints have that let me specify npm / gh / etc).
I'm not sure how much this would complicate things, but maybe the list could also be sorted by some kind of 'popularity' measure such as download stats.
For the example use case, I could probably guess that the intended version was the main npm version, but for other libraries I might not be able to identify the 'main canonical source' for that file as easily. That is where 'popularity' could help narrow it down (e.g. if most of the results have hundreds or thousands of downloads and one has millions, I could make a solid guess).
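As a rough illustration of the type filter plus popularity sort suggested above, here is a client-side sketch. Everything in it is an assumption: the API offers neither option today, `rank_matches` is a made-up name, and `hits` stands in for whatever download statistic would actually be available, keyed by `(type, name)`.

```python
from typing import List, Mapping, Optional, Tuple

# (type, name, version, file), matching the sample results in this thread.
Match = Tuple[str, str, str, str]


def rank_matches(
    matches: List[Match],
    hits: Mapping[Tuple[str, str], int],
    type_filter: Optional[str] = None,
) -> List[Match]:
    """Optionally keep only one package type, then sort by hit count, highest first.

    `hits` is a hypothetical mapping of (type, name) to a download count;
    packages missing from it are treated as having zero hits. The sort is
    stable, so entries with equal counts keep their original order.
    """
    selected = [m for m in matches if type_filter is None or m[0] == type_filter]
    return sorted(selected, key=lambda m: hits.get((m[0], m[1]), 0), reverse=True)
```

With counts that differ by orders of magnitude, the first entry of `rank_matches(matches, hits, type_filter="npm")` would be the natural guess for the canonical source.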
A more complicated idea (I'm not even sure it would be viable) that might help identify the 'canonical' version better would be to:
- look up the hash and get a list of matching projects that have that file
- for each of those projects, check the package.json (or similar) to see if this file is included in the main exports for that package
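The package.json check from the list above might be sketched like this. It is a naive illustration, not how jsDelivr resolves entry points: it only compares the file path against string values found in a handful of commonly used manifest fields, and it does not handle wildcard patterns in `exports`. The function names are hypothetical.

```python
import posixpath
from typing import Any, Iterator

# Manifest fields that commonly point at a package's entry files.
ENTRY_FIELDS = ("main", "module", "browser", "jsdelivr", "unpkg", "exports")


def _strings(value: Any) -> Iterator[str]:
    """Yield every string nested inside a JSON-like value."""
    if isinstance(value, str):
        yield value
    elif isinstance(value, dict):
        for item in value.values():
            yield from _strings(item)
    elif isinstance(value, list):
        for item in value:
            yield from _strings(item)


def _normalize(path: str) -> str:
    """Turn 'dist/x.js', './dist/x.js' and '/dist/x.js' into '/dist/x.js'."""
    return posixpath.normpath(posixpath.join("/", path))


def is_entry_point(package_json: dict, file_path: str) -> bool:
    """Return True if file_path is referenced by one of the entry fields."""
    target = _normalize(file_path)
    return any(
        _normalize(candidate) == target
        for field in ENTRY_FIELDS
        for candidate in _strings(package_json.get(field))
    )
```

A match whose file is an entry point of its own package would then be a stronger candidate for the 'canonical' source than a copy vendored under a docs or assets directory.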
It looks like this is the file that handles the lookup:
https://github.com/jsdelivr/data.jsdelivr.com/blob/46389da6bcbf5ae5c153e03ba7e00cbfd2e5299b/src/routes/v1/LookupRequest.js#L6-L8
Which then calls into this to do the actual lookup:
https://github.com/jsdelivr/data.jsdelivr.com/blob/46389da6bcbf5ae5c153e03ba7e00cbfd2e5299b/src/models/File.js#L47-L59
So maybe to sort by downloads it could join against something like (from a quick/naive skim):
- https://github.com/jsdelivr/data.jsdelivr.com/blob/master/src/models/PackageHits.js
- https://github.com/jsdelivr/data.jsdelivr.com/blob/master/src/models/PackageVersionHits.js
- https://github.com/jsdelivr/data.jsdelivr.com/blob/master/src/models/FileHits.js
- etc