scancode.io
Map archives when their extracted directory is mapped/processed
In a deploy_to_devel pipeline, when I have an archive like "foo.zip", there will be a directory "foo.zip.extract" with the extracted content.
-
If "foo.zip" is matched to the purlDB then "foo.zip.extract" should be treated as matched too. This is already the case.
-
Otherwise, if "foo.zip" is matched neither to the devel side nor to the purlDB, it should be assigned an "extracted" status right at the end of the step that matches archives to the purlDB, and it should not be processed any further. This works because its extracted content in "foo.zip.extract" is processed separately.
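To make the proposed rule concrete, here is a minimal sketch in plain Python. It does not use the real scancode.io models: the `Resource` dataclass, the `flag_extracted_archives` helper, and the convention that an empty `status` means "not matched" are all assumptions made for illustration. Only the ".extract" suffix and the "extracted" status come from the description above.

```python
from dataclasses import dataclass

# From the issue: "foo.zip" is extracted to "foo.zip.extract".
EXTRACT_SUFFIX = ".extract"


@dataclass
class Resource:
    # Hypothetical stand-in for a codebase resource, not the scancode.io model.
    path: str
    is_archive: bool = False
    status: str = ""  # assumption: empty string means "not matched to anything"


def flag_extracted_archives(resources):
    """Set status "extracted" on unmatched archives that have extracted content.

    Returns the paths of the archives that were flagged.
    """
    all_paths = [r.path for r in resources]
    flagged = []
    for resource in resources:
        # Skip non-archives and archives already matched (devel side or purlDB).
        if not resource.is_archive or resource.status:
            continue
        extract_dir = resource.path + EXTRACT_SUFFIX
        has_extracted_content = any(
            p == extract_dir or p.startswith(extract_dir + "/") for p in all_paths
        )
        if has_extracted_content:
            resource.status = "extracted"
            flagged.append(resource.path)
    return flagged
```

In this sketch the helper would run once, after the purlDB matching step, so that archives matched there keep their status and only the leftovers are marked "extracted".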
We need to validate that:
- all the "is_archive" files are extracted. Any such file that is not extracted is a possible error.
- we are trying to match every "is_archive" file. This needs changing and is tracked in #826
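The first check in the list above could be sketched as follows. This is an illustration under stated assumptions, not the pipeline's code: `find_unextracted_archives` is a hypothetical helper that works on plain path strings, and it assumes extracted content always lives under "<archive path>.extract".

```python
def find_unextracted_archives(archive_paths, all_paths, suffix=".extract"):
    """Return the archives that have no extracted directory or content.

    archive_paths: paths of the files flagged "is_archive".
    all_paths: every resource path in the codebase.
    """
    all_paths = list(all_paths)
    missing = []
    for archive in archive_paths:
        extract_dir = archive + suffix
        # The archive counts as extracted if the directory itself or
        # anything below it is present in the codebase.
        extracted = any(
            p == extract_dir or p.startswith(extract_dir + "/") for p in all_paths
        )
        if not extracted:
            missing.append(archive)
    return sorted(missing)
```

Any path returned here is an archive without extracted content, which the issue treats as a possible error worth reporting.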