[WIP] update licenses

Open MartinsNadia opened this issue 1 year ago • 7 comments

Opened a new one to fix the origin branch.

MartinsNadia avatar Aug 19 '24 14:08 MartinsNadia

Instance eessi-bot-mc-aws is configured to build for:

  • architectures: x86_64/generic, x86_64/intel/haswell, x86_64/intel/skylake_avx512, x86_64/amd/zen2, x86_64/amd/zen3, aarch64/generic, aarch64/neoverse_n1, aarch64/neoverse_v1
  • repositories: eessi.io-2023.06-compat, eessi-hpc.org-2023.06-software, eessi-hpc.org-2023.06-compat, eessi.io-2023.06-software

eessi-bot[bot] avatar Aug 19 '24 14:08 eessi-bot[bot]

Instance eessi-bot-mc-azure is configured to build for:

  • architectures: x86_64/amd/zen4
  • repositories: eessi.io-2023.06-compat, eessi-hpc.org-2023.06-compat, eessi-hpc.org-2023.06-software, eessi.io-2023.06-software

eessi-bot[bot] avatar Aug 19 '24 14:08 eessi-bot[bot]

WIP

Adding a new license update script (PR 457) and a draft YAML file.

MartinsNadia avatar Aug 19 '24 14:08 MartinsNadia

Stray thoughts right now:

  • I kept the retrieval date because the original script did so, but I don't see much sense in keeping it; I would keep the license URL instead
  • There are a lot of "not found" results, but I expect fewer of them soon: right now some source_urls are still formatted with EB syntax, and fixing that is my priority for tomorrow
  • Need to fix the "needs manual action" part, as it's case sensitive right now with the "Other" licenses
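A minimal sketch of a case-insensitive check for the "needs manual action" part. The function name and the set of placeholder values are hypothetical, chosen for illustration only; they are not taken from the actual script.

```python
# Hypothetical helper: the name and the placeholder values below are
# assumptions for illustration, not taken from the actual script.
MANUAL_ACTION_VALUES = {"other", "not found"}


def needs_manual_action(license_name):
    """Return True when the detected license is missing or a placeholder."""
    if license_name is None:
        return True
    # casefold() makes "Other", "other" and "OTHER" compare equal.
    normalized = license_name.strip().casefold()
    return not normalized or normalized in MANUAL_ACTION_VALUES
```

Normalizing once, before the comparison, keeps the placeholder list itself free of case variants.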

hvelab avatar Feb 27 '25 18:02 hvelab

Updates:

  • Fixed a bug; it now finds more licenses
  • Now shows the actual API call the license was retrieved from
  • For "Other" and "not found", it scrapes for the LICENSE file -> we need to find a way to retrieve the SPDX identifier from there
  • This needs to be improved for the packages with "Other"
  • For the totally "not found" ones, it shows source_urls and homepages; it seems most of them need to be sanitized, or the scraping should be done from there; also add more keywords
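One possible way to get an SPDX identifier out of a scraped LICENSE file is plain keyword matching on the text. The sketch below assumes a tiny, hand-written hint table (`SPDX_HINTS`) and a hypothetical `guess_spdx` function; neither comes from the script in this PR, and a real table would need many more entries and manual review of the misses.

```python
# Assumed hint table: each SPDX identifier maps to phrases that must all
# appear in the license text. Deliberately small and incomplete.
SPDX_HINTS = [
    ("MIT", ["permission is hereby granted, free of charge"]),
    ("Apache-2.0", ["apache license", "version 2.0"]),
    ("MPL-2.0", ["mozilla public license", "version 2.0"]),
    ("BSD-3-Clause", ["redistribution and use in source and binary forms",
                      "neither the name"]),
]


def guess_spdx(license_text):
    """Return a guessed SPDX identifier, or None if no hint matches."""
    # Collapse whitespace and case so line wrapping does not break matching.
    normalized = " ".join(license_text.split()).casefold()
    for spdx_id, phrases in SPDX_HINTS:
        if all(phrase in normalized for phrase in phrases):
            return spdx_id
    return None
```

Anything that returns `None` here would still fall into the manual-check bucket.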

hvelab avatar Mar 03 '25 16:03 hvelab

Finished doing the manual checks (while watching some random movie on the TV). We would just need permission for PGPLOT (https://sites.astro.caltech.edu/~tjp/pgplot/#copyright), and for CUDA I'm not sure if the proprietary license allows us to redistribute it. The scraper (aside from update_licenses.json) is good for an initial populate, but covering every single case is very difficult and required many more manual checks than expected. At least it is done for now for all software available in EESSI excluding extensions, which is my next focus, e.g. by enforcing that the license is specified in the EB files.

We could try it from a hook and, if it fails, request the manual action? Again, it's very difficult to scrape this from web pages, and it sometimes requires human interpretation.
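A rough sketch of that "try automatically, fall back to manual action" flow. Everything here is hypothetical: `resolve_license`, the detector callables, and the shape of the returned dict are assumptions for illustration, not an existing hook API.

```python
def resolve_license(package, detectors):
    """Try each detector in order; flag the package for manual action if all fail.

    `detectors` is a list of callables taking a package name and returning
    a license string or None (hypothetical interface).
    """
    for detect in detectors:
        try:
            result = detect(package)
        except Exception:
            # A failing detector should not abort the whole run.
            result = None
        if result:
            return {"package": package, "license": result, "manual_action": False}
    return {"package": package, "license": None, "manual_action": True}
```

For example, `resolve_license("PGPLOT", [from_api, from_scraper])` would only set `manual_action` to `True` when both (hypothetical) detectors come back empty or raise.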

hvelab avatar Mar 11 '25 20:03 hvelab

@ocaisa should this be closed and moved to https://github.com/EESSI/software-layer-scripts?

laraPPr avatar Jun 27 '25 13:06 laraPPr