
Track download metrics for JFrog artifacts

Open jiekang opened this issue 1 year ago • 4 comments

JFrog collects download data per artifact that is visible in their web UI:

For example: https://packages.adoptium.net/ui/repos/tree/General/rpm/fedora/38/x86_64/Packages/temurin-11-jdk-11.0.20.1.0.1-1.x86_64.rpm

It would be nice to have this data collected and collated together as part of the Adoptium dashboard.

jiekang avatar Apr 12 '24 14:04 jiekang

I will note that we have introduced a Fastly CDN in front of JFrog to take some of the load off, so we need to include stats from that too to be realistic.

I was looking at the Fastly API to see if we can gather package download numbers from there. The stats look quite generic and not oriented to the actual artefacts, which may be because Fastly just caches HTTP responses without knowing much about the underlying files. So I'm not sure what we could usefully use to gather a 'daily downloads by JDK version' statistic like we do for GitHub and Docker.

However, we could present some other interesting trends, such as the number of requests and the number of bytes moved, which serve as a 'proxy' for the actual number of packages. Here's a screen grab of the CDN activity in March. As you can see, it is shielding JFrog from 95% of calls.

[Screenshot, 2024-04-18: CDN activity for March]

tellison avatar Apr 18 '24 08:04 tellison

https://jfrog.com/help/r/jfrog-rest-apis/get-the-open-metrics-for-artifactory

  • jfsh_binaries_download_successes_total
    • not sure if this differentiates between metadata and actual binary downloads or not, need to query to see
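
To check what that counter actually reports, one option is to pull the Open Metrics text and total the samples for the metric. A minimal sketch, assuming the endpoint returns standard Open Metrics text lines; the `repo` label and the numbers in `SAMPLE` are invented for illustration and are not real Adoptium data:

```python
def sum_metric(openmetrics_text: str, metric_name: str) -> float:
    """Sum every sample of metric_name across all of its label sets."""
    total = 0.0
    for raw in openmetrics_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comment/metadata lines (# TYPE, # HELP, # EOF).
        if not line or line.startswith("#"):
            continue
        if "{" in line:
            # Form: name{label="value",...} value [timestamp]
            name = line[: line.index("{")]
            rest = line[line.rindex("}") + 1 :]
        else:
            # Form: name value [timestamp]
            name, _, rest = line.partition(" ")
        fields = rest.split()
        if name == metric_name and fields:
            total += float(fields[0])
    return total


# Hypothetical sample payload: labels and values are made up.
SAMPLE = """\
# TYPE jfsh_binaries_download_successes counter
jfsh_binaries_download_successes_total{repo="rpm"} 120 1715965200000
jfsh_binaries_download_successes_total{repo="deb"} 80 1715965200000
jfsh_unrelated_total 5
# EOF
"""

print(sum_metric(SAMPLE, "jfsh_binaries_download_successes_total"))
```

This only totals the counter. It does not answer whether metadata fetches are included, which would still need checking against real output.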

smlambert avatar May 17 '24 17:05 smlambert

With the switch to Cloudflare, there is a way to query downloads. @tellison has prototyped it here: https://gist.github.com/tellison/e2736c65cf74c8f9c82d2b83f43ade16

One can query via curl (see example in docs at https://developers.cloudflare.com/analytics/graphql-api/getting-started/execute-graphql-query/)
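
For anyone who prefers a script to curl, here is a rough Python sketch of the same kind of request. It only builds the request and does not send it. The dataset and field names in the query (`httpRequestsAdaptiveGroups`, `clientRequestPath`, `edgeResponseStatus`) are my assumptions and should be checked against the Cloudflare schema and the gist above; `build_request` is a hypothetical helper, and the zone tag and token are placeholders.

```python
import json
import urllib.request

GRAPHQL_URL = "https://api.cloudflare.com/client/v4/graphql"

# Assumed query shape: request counts grouped by path for one zone.
QUERY = """
query PackageDownloads($zoneTag: string, $start: Time, $end: Time) {
  viewer {
    zones(filter: {zoneTag: $zoneTag}) {
      httpRequestsAdaptiveGroups(
        limit: 100
        filter: {datetime_geq: $start, datetime_lt: $end, edgeResponseStatus: 200}
        orderBy: [count_DESC]
      ) {
        count
        dimensions { clientRequestPath }
      }
    }
  }
}
"""


def build_request(zone_tag: str, api_token: str, start: str, end: str) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying the GraphQL query."""
    payload = {
        "query": QUERY,
        "variables": {"zoneTag": zone_tag, "start": start, "end": end},
    }
    return urllib.request.Request(
        GRAPHQL_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder credentials; sending would be urllib.request.urlopen(req).
req = build_request("ZONE_TAG", "API_TOKEN", "2024-11-01T00:00:00Z", "2024-11-02T00:00:00Z")
print(req.get_method(), req.full_url)
```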

smlambert avatar Nov 15 '24 13:11 smlambert

As Shelley noted, things have become simpler since April: all our traffic to packages.adoptium.net now goes through Cloudflare, and we have a method for tracking downloads at that level.

IMHO the correct order would be to update the Adoptium API to include the package download stats first, then update the dashboard code to show the numbers. I have opened an issue to cover updating the API.
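
As a rough illustration of the collation the API would need, per-path request counts could be grouped by JDK feature version using the package file name. This is a sketch only: the regex assumes file names like the `temurin-11-jdk-...rpm` example earlier in this issue, `downloads_by_feature_version` is a hypothetical helper, and the counts below are invented.

```python
import re
from collections import Counter

# Matches package file names such as temurin-11-jdk-11.0.20.1.0.1-1.x86_64.rpm
PACKAGE_RE = re.compile(r"temurin-(\d+)-(jdk|jre)[-_]")


def downloads_by_feature_version(path_counts):
    """Group (path, count) pairs by JDK feature version, skipping non-package paths."""
    totals = Counter()
    for path, count in path_counts:
        filename = path.rsplit("/", 1)[-1]
        match = PACKAGE_RE.match(filename)
        if match:
            totals[int(match.group(1))] += count
    return dict(totals)


# Invented counts; the second and third paths are hypothetical.
sample = [
    ("/rpm/fedora/38/x86_64/Packages/temurin-11-jdk-11.0.20.1.0.1-1.x86_64.rpm", 3),
    ("/rpm/fedora/38/x86_64/Packages/temurin-17-jdk-17.0.8.1.0.1-1.x86_64.rpm", 2),
    ("/rpm/fedora/38/x86_64/repodata/repomd.xml", 40),  # metadata, not a package
]

print(downloads_by_feature_version(sample))
```

Filtering on the file name also keeps repository metadata requests out of the totals.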

tellison avatar Nov 18 '24 15:11 tellison