
Enhance License Detection and Filename Handling for GitHub /api/collect/

Open chinyeungli opened this issue 7 months ago • 1 comments

purl: pkg:github/adobe/aa-client-go@0.9.1

The /api/collect/ endpoint returns the following for the above purl:

[
  {
    "url": "http://127.0.0.1:8001/api/packages/a89afb01-95f2-4772-89ea-1b0bfe9c5414/",
    "uuid": "a89afb01-95f2-4772-89ea-1b0bfe9c5414",
    "filename": "v0.9.1.tar.gz",
    "package_sets": [
      {
        "uuid": "d1aa0772-5229-474b-af0d-67aab85d8f70",
        "packages": [
          "http://127.0.0.1:8001/api/packages/a89afb01-95f2-4772-89ea-1b0bfe9c5414/",
          "http://127.0.0.1:8001/api/packages/43ad836e-08d0-4895-bfd4-1de8a77f3141/"
        ]
      }
    ],
    "package_content": null,
    "purl": "pkg:github/adobe/aa-client-go@0.9.1",
    "type": "github",
    "namespace": "adobe",
    "name": "aa-client-go",
    "version": "0.9.1",
    "qualifiers": "",
    "subpath": "",
    "primary_language": "Go",
    "description": null,
    "release_date": "2021-08-18T08:59:53Z",
    "parties": [],
    "keywords": [],
    "homepage_url": null,
    "download_url": "https://github.com/adobe/aa-client-go/archive/refs/tags/v0.9.1.tar.gz",
    "bug_tracking_url": "https://github.com/adobe/aa-client-go/issues",
    "code_view_url": "https://github.com/adobe/aa-client-go",
    "vcs_url": "git://github.com/adobe/aa-client-go.git",
    "repository_homepage_url": null,
    "repository_download_url": null,
    "api_data_url": null,
    "size": null,
    "md5": null,
    "sha1": null,
    "sha256": null,
    "sha512": null,
    "copyright": null,
    "holder": null,
    "declared_license_expression": null,
    "declared_license_expression_spdx": null,
    "license_detections": [],
    "other_license_expression": null,
    "other_license_expression_spdx": null,
    "other_license_detections": [],
    "extracted_license_statement": null,
    "notice_text": null,
    "source_packages": [],
    "extra_data": {},
    "package_uid": "pkg:github/adobe/aa-client-go@0.9.1?uuid=a89afb01-95f2-4772-89ea-1b0bfe9c5414",
    "datasource_id": null,
    "file_references": [],
    "dependencies": [],
    "resources": "http://127.0.0.1:8001/api/packages/a89afb01-95f2-4772-89ea-1b0bfe9c5414/resources/",
    "history": "http://127.0.0.1:8001/api/packages/a89afb01-95f2-4772-89ea-1b0bfe9c5414/history/"
  }
]

The license fields above are all empty. The tool should fetch the license information from GitHub's API ( https://api.github.com/repos/adobe/aa-client-go ), which returns:

  "license": {
    "key": "apache-2.0",
    "name": "Apache License 2.0",
    "spdx_id": "Apache-2.0",
    "url": "https://api.github.com/licenses/apache-2.0",
    "node_id": "MDc6TGljZW5zZTI="
  },
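
A minimal sketch of how that lookup could work, not the actual purldb implementation. The function names `fetch_repo_metadata` and `license_fields_from_repo` are hypothetical, and the choice of which package fields to fill is an assumption. Only the SPDX identifier and the license name are copied over, because GitHub's `key` is not guaranteed to match a ScanCode license key.

```python
import json
from urllib.request import Request, urlopen

# Repository endpoint quoted in this issue.
GITHUB_REPO_API = "https://api.github.com/repos/{namespace}/{name}"


def fetch_repo_metadata(namespace, name, timeout=10):
    """Return the decoded JSON for a GitHub repository (hypothetical helper)."""
    url = GITHUB_REPO_API.format(namespace=namespace, name=name)
    request = Request(url, headers={"Accept": "application/vnd.github+json"})
    with urlopen(request, timeout=timeout) as response:
        return json.load(response)


def license_fields_from_repo(repo):
    """Map the "license" object of a repo payload to package license fields.

    Assumption: only the SPDX expression and the extracted statement are
    filled in; mapping to a ScanCode license key is left out of this sketch.
    """
    license_info = repo.get("license") or {}
    spdx_id = license_info.get("spdx_id")
    # GitHub reports "NOASSERTION" when it cannot identify the license.
    if not spdx_id or spdx_id == "NOASSERTION":
        return {
            "declared_license_expression_spdx": None,
            "extracted_license_statement": None,
        }
    return {
        "declared_license_expression_spdx": spdx_id,
        "extracted_license_statement": license_info.get("name"),
    }
```

For the payload shown above, `license_fields_from_repo(fetch_repo_metadata("adobe", "aa-client-go"))` would set `declared_license_expression_spdx` to `Apache-2.0` instead of leaving it null. Unauthenticated requests to the GitHub API are rate limited, so a real collector would likely need a token.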

In addition, we should use the actual filename of the file downloaded from the download_url. For instance, given:

"download_url": "https://github.com/adobe/aa-client-go/archive/refs/tags/v0.9.1.tar.gz",

The filename should be aa-client-go-0.9.1.tar.gz, the name of the downloaded file, instead of v0.9.1.tar.gz, the last segment of the URL.
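
One possible way to get the downloaded name is to read the `Content-Disposition` header of the download response and fall back to the URL basename when the header is missing. This is a sketch under that assumption; the helper name `filename_from_download` is hypothetical and this is not necessarily how existing purldb code handles it.

```python
import posixpath
from email.message import Message
from urllib.parse import urlparse


def filename_from_download(download_url, content_disposition=None):
    """Pick a filename for a downloaded archive (hypothetical helper).

    Prefer the filename given in the Content-Disposition response header;
    otherwise use the last path segment of the download URL.
    """
    if content_disposition:
        # Reuse the stdlib header parser instead of splitting by hand.
        message = Message()
        message["Content-Disposition"] = content_disposition
        filename = message.get_filename()
        if filename:
            # Drop any directory part a server might include.
            return posixpath.basename(filename)
    return posixpath.basename(urlparse(download_url).path)
```

Called with the download_url above and a header value of `attachment; filename=aa-client-go-0.9.1.tar.gz`, the helper returns `aa-client-go-0.9.1.tar.gz`. Without a header it returns `v0.9.1.tar.gz`, which is the current behavior described in this issue.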

chinyeungli · Apr 16 '25 00:04

The filename issue can be fixed with the code from https://github.com/aboutcode-org/purldb/pull/608 . Once that PR is merged, we can reuse it here.

chinyeungli · Apr 16 '25 06:04