vulnerablecode
Better track and document ignored versions
Some data sources or ecosystems have weird version data. For now we track some of these as ignorable versions, for instance in the GitHub Importer.
The more I see these ignorable_versions, and in particular the list of "WEIRD_IGNORABLE_VERSIONS", the more I think this smells bad. IMHO we should rather take this approach:
- An ignorable version should be tied to a specific package URL (purl), and not be freestanding.
- Ideally it would be an IgnorableVersion object with a purl attribute, a list of ignorable versions AND a reason text that explains why we are ignoring these.
- When using these, we should only ignore a version IFF we have a proper matching purl, so that we are super restrictive about the context.
- In some cases we may even consider still importing this invalid data but marking it as invalid (not usable) with some flag in the models.
Otherwise, I feel we are likely to skip things silently, and that's a problem.
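To make the proposal above concrete, here is a minimal sketch of what such an object could look like. Everything in it is hypothetical: the field names, the `find_ignore_reason` helper, and the example purl and version are illustrations only, not existing vulnerablecode code. It uses plain strings for purls to stay dependency-free; a real version would likely use a proper purl type.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Optional


@dataclass(frozen=True)
class IgnorableVersion:
    """Hypothetical record tying ignorable versions to one package."""

    purl: str  # versionless package URL, e.g. "pkg:npm/example"
    versions: FrozenSet[str]  # exact version strings to ignore for this purl
    reason: str  # human-readable explanation of why these are ignored


def find_ignore_reason(
    purl: str, version: str, rules: Iterable[IgnorableVersion]
) -> Optional[str]:
    """Return the documented reason if (purl, version) is ignorable, else None.

    A version is only ignored when the purl matches exactly, so the same
    odd version string on another package is never skipped by accident.
    """
    for rule in rules:
        if rule.purl == purl and version in rule.versions:
            return rule.reason
    return None


# Example data, made up for illustration.
RULES = [
    IgnorableVersion(
        purl="pkg:npm/example",
        versions=frozenset({"0.0.0-weird"}),
        reason="Placeholder version published by mistake upstream.",
    ),
]

reason = find_ignore_reason("pkg:npm/example", "0.0.0-weird", RULES)
if reason is not None:
    # Instead of skipping silently, the importer could log the reason
    # or store the version with an "invalid" flag.
    print(f"ignored: {reason}")
```

Returning the reason rather than a bare boolean means the caller always has something to log or store, which addresses the concern about skipping things silently.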
You might consider putting them in an external configuration or data file. That way you don't have to change the script whenever you add some data.
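As a rough sketch of that suggestion, the entries could live in a JSON file that is parsed at startup. The file layout, the key names (`purl`, `versions`, `reason`) and the function names below are all assumptions made for illustration; nothing here reflects an existing vulnerablecode format.

```python
import json
from pathlib import Path


def parse_ignorable_versions(text):
    """Parse JSON text into a mapping of purl -> {"versions": set, "reason": str}.

    Assumed layout: a JSON list of objects with "purl", "versions" and
    "reason" keys. Entries without a reason are rejected so that every
    ignored version stays documented.
    """
    rules = {}
    for entry in json.loads(text):
        if not entry.get("reason"):
            raise ValueError(f"Missing reason for {entry.get('purl')!r}")
        rules[entry["purl"]] = {
            "versions": set(entry["versions"]),
            "reason": entry["reason"],
        }
    return rules


def load_ignorable_versions(path):
    """Read the data file at `path` (hypothetical location) and parse it."""
    return parse_ignorable_versions(Path(path).read_text(encoding="utf-8"))


# Example content of such a data file, made up for illustration.
EXAMPLE = """
[
  {
    "purl": "pkg:npm/example",
    "versions": ["0.0.0-weird"],
    "reason": "Placeholder version published by mistake upstream."
  }
]
"""

rules = parse_ignorable_versions(EXAMPLE)
```

Adding a new ignorable version then means editing the data file only, and the required `reason` field keeps the documentation next to the data.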