Add pipeline to publish scan to federatedcode

Open keshav-space opened this issue 1 year ago • 4 comments

  • Closes https://github.com/aboutcode-org/federatedcode/issues/23

keshav-space avatar Oct 07 '24 18:10 keshav-space

@pombredanne as per your suggestion, I’ve added the PURL field to the project. We're now using this Project PURL to push the scan result to FederatedCode, provided that:

  • All pipelines have successfully completed
  • Input is a download_url
  • Project PURL has version
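The three conditions above can be read as a single guard. What follows is a minimal sketch only, not the code from the PR: the function names `purl_has_version` and `can_publish`, the `"success"` status string, and the boolean `input_is_download_url` flag are all assumptions made for illustration, and the PURL check is a simplified string test rather than a full parser.

```python
def purl_has_version(purl: str) -> bool:
    """Return True if a PURL string carries a non-empty version component.

    Simplified check (assumption): real code would more likely rely on a
    PURL parsing library than on string splitting.
    """
    if not purl.startswith("pkg:"):
        return False
    # Drop the subpath ("#...") and qualifiers ("?...") first, so that an "@"
    # inside them is not mistaken for the version separator.
    core = purl.split("#", 1)[0].split("?", 1)[0]
    _, sep, version = core.rpartition("@")
    # A "/" after the last "@" means the "@" belonged to an unencoded
    # namespace (e.g. an npm scope), not to a version.
    return bool(sep) and bool(version) and "/" not in version


def can_publish(pipeline_statuses, input_is_download_url, project_purl):
    """Hypothetical guard combining the three conditions listed above."""
    return bool(
        pipeline_statuses
        and all(status == "success" for status in pipeline_statuses)
        and input_is_download_url
        and purl_has_version(project_purl)
    )


if __name__ == "__main__":
    print(can_publish(["success", "success"], True, "pkg:pypi/django@5.0"))
    print(can_publish(["success"], True, "pkg:pypi/django"))
```

With this shape, a project whose PURL lacks a version is skipped rather than published to an ambiguous location.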

keshav-space avatar Oct 23 '24 10:10 keshav-space

@keshav-space Could you provide some context about the need for adding a project_purl field? Why not use the uuid for example? It seems to me that this is not directly related, and the addition of a new concept such as this one should be discussed and handled separately. Also, we recently introduced name and version fields for the project. Allowing for a manually provided PURL does not take into consideration those fields.

tdruez avatar Oct 28 '24 04:10 tdruez

@tdruez

Could you provide some context about the need for adding a project_purl field?

We want to store the scancode.io scan results in git repositories, and we use PURL to determine the git repository and the exact directory path where the scan should be stored. This optional project_purl field would be needed to push the final scan results to FederatedCode.

Why not use the uuid for example?

Project uuid would be specific to a particular scancode.io instance. We want to store package scan/vulnerability data in a way that it can be retrieved using just the PURL, which won't be possible with uuid.
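To illustrate why a PURL works where a uuid does not, here is a rough sketch of deriving a storage location from a PURL alone. It is not the scheme implemented by `aboutcode.hashid` or FederatedCode: the `scans-<type>-<index>` repository naming, the `repo_count` parameter, and the `type/name/version` directory layout are all hypothetical. The point is only that the mapping is deterministic, so any scancode.io instance computes the same location for the same package.

```python
import hashlib
from pathlib import PurePosixPath


def split_purl(purl):
    """Split 'pkg:type/namespace/name@version' into (type, name_path, version)."""
    if not purl.startswith("pkg:"):
        raise ValueError(f"not a PURL: {purl!r}")
    # Ignore subpath and qualifiers; only the package identity matters here.
    core = purl[len("pkg:"):].split("#", 1)[0].split("?", 1)[0]
    path, sep, version = core.rpartition("@")
    if not sep:
        # No "@" at all: the whole string is the path and there is no version.
        path, version = core, ""
    package_type, _, name_path = path.partition("/")
    return package_type, name_path, version


def storage_location(purl, repo_count=16):
    """Return a (repository, directory) pair computed from the PURL only.

    Hypothetical layout: the repository is picked by hashing the versionless
    PURL, so all versions of one package land in the same repository.
    """
    package_type, name_path, version = split_purl(purl)
    base_purl = f"pkg:{package_type}/{name_path}"
    digest = hashlib.sha256(base_purl.encode("utf-8")).hexdigest()
    repo_index = int(digest, 16) % repo_count
    repository = f"scans-{package_type}-{repo_index:02d}"
    parts = [package_type, name_path] + ([version] if version else [])
    return repository, str(PurePosixPath(*parts))


if __name__ == "__main__":
    print(storage_location("pkg:pypi/django@5.0"))
```

A uuid-based path would differ on every instance that scanned the same package, which is the limitation described above.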

It seems to me that this is not directly related, and the addition of a new concept such as this one should be discussed and handled separately.

Sure, let's discuss this and we can split this into two different PRs.

Also, we recently introduced name and version fields for the project. Allowing for a manually provided PURL does not take into consideration those fields.

My understanding was that the product name and product version were closely related to DejaCode.

keshav-space avatar Oct 29 '24 09:10 keshav-space

We want to store the scancode.io scan results in git repositories, and we use PURL to determine the git repository and the exact directory path where the scan should be stored. This optional project_purl field would be needed to push the final scan results to FederatedCode.

This should be documented in the code.

My understanding was that the product name and product version were closely related to DejaCode.

You're right, this seems quite unrelated.

tdruez avatar Oct 29 '24 10:10 tdruez

Is there something missing to get this merged?

pombredanne avatar Nov 06 '24 11:11 pombredanne

is this https://github.com/aboutcode-org/scancode.io/pull/1400/files#diff-71c80d25cae67eed0aa112b1d847002632d97e7f223d9df6109d39d9e26bc577 a wanted change?

@tdruez yes, namespace package directory should not contain __init__.py https://packaging.python.org/en/latest/guides/packaging-namespace-packages/#native-namespace-packages.

That file is needed for proper packaging.

In that case, instead of having an empty __init__.py, we can have a __init__.py with the following code:

import pkgutil

__path__ = pkgutil.extend_path(__path__, __name__)

This should work for both packaging and namespace package resolution.

keshav-space avatar Nov 12 '24 11:11 keshav-space

@tdruez yes, namespace package directory should not contain __init__.py

Fair enough, but it seems quite unrelated to the context of this PR. It would be better to open an issue for discussion.

tdruez avatar Nov 12 '24 11:11 tdruez

Fair enough, but it seems quite unrelated to the context of this PR. It would be better to open an issue for discussion.

It is related to this PR because the pipeline uses another namespace package, aboutcode.hashid. If we place an empty __init__.py in our local aboutcode directory, the resolution for all the aboutcode namespace packages will fail.

  File "/scancode.io/scanpipe/pipelines/publish_to_federatedcode.py", line 25, in <module>
    from scanpipe.pipes import federatedcode
  File "/scancode.io/scanpipe/pipes/federatedcode.py", line 35, in <module>
    from aboutcode import hashid
ImportError: cannot import name 'hashid' from 'aboutcode' (/scancode.io/aboutcode/__init__.py)
make: *** [Makefile:126: test] Error 1
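The failure above, and the `pkgutil.extend_path` workaround suggested earlier in the thread, can be reproduced in isolation. The sketch below is an illustration under assumptions: it builds a throwaway package named `demo_ns` (a stand-in, not the real `aboutcode` package) split across two `sys.path` entries, then tries to import a module that lives only in the second one.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Same content as the __init__.py proposed earlier in the thread.
EXTEND_PATH_INIT = (
    "import pkgutil\n"
    "__path__ = pkgutil.extend_path(__path__, __name__)\n"
)


def build_tree(root, init_body):
    """Create two sys.path roots that both contain a 'demo_ns' directory."""
    local = root / "local" / "demo_ns"
    installed = root / "installed" / "demo_ns"
    local.mkdir(parents=True)
    installed.mkdir(parents=True)
    # The module we want to import lives only in the "installed" portion.
    (installed / "hashid_demo.py").write_text("VALUE = 42\n")
    if init_body is not None:
        (local / "__init__.py").write_text(init_body)
    return [str(root / "local"), str(root / "installed")]


def can_import_portion(init_body):
    """Return True if demo_ns.hashid_demo is importable for this __init__.py."""
    with tempfile.TemporaryDirectory() as tmp:
        saved_path = list(sys.path)
        sys.path[:0] = build_tree(Path(tmp), init_body)
        try:
            importlib.invalidate_caches()
            module = importlib.import_module("demo_ns.hashid_demo")
            return module.VALUE == 42
        except ImportError:
            return False
        finally:
            # Restore global import state so repeated calls stay independent.
            sys.path[:] = saved_path
            for name in list(sys.modules):
                if name == "demo_ns" or name.startswith("demo_ns."):
                    del sys.modules[name]


if __name__ == "__main__":
    print("empty __init__.py:", can_import_portion(""))
    print("no __init__.py:", can_import_portion(None))
    print("extend_path __init__.py:", can_import_portion(EXTEND_PATH_INIT))
```

An empty `__init__.py` turns the local directory into a regular package that shadows the other portion, which matches the `ImportError` in the traceback; removing the file or extending `__path__` lets both portions resolve.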

keshav-space avatar Nov 12 '24 11:11 keshav-space

Thanks for the clarification. Let's merge then!

tdruez avatar Nov 12 '24 12:11 tdruez