cve-bin-tool icon indicating copy to clipboard operation
cve-bin-tool copied to clipboard

Package List processing improvements

Open anthonyharrison opened this issue 4 years ago • 5 comments

A few comments on the Package List processing which is performed by cve-bin-tool.

Pinging @BreadGenie for initial comment.

BUG If the distribution is not supported (e.g. Kali, Raspian) the error message is "Invalid Package list". A more descriptive error message may be more appropriate e.g. 'Unsupported Distribution'. However it might be even better to do something earlier in checkfile before attempting to process the file e.g.

        if distro.id() not in SUPPORTED_DISTROS:
            LOGGER.warning(
                f"Package list support only available for {','.join(SUPPORTED_DISTROS)}!"
            )
            with ErrorHandler(mode=error_mode):
                raise InvalidListError("Distro is not supported")

ENHANCMENT Instead of detecting the type of distribution of the machine running the tool, it might be a good idea to specify the distribution type (e.g. debian, rpm, arch, windows, bsd) with the filename e.g. debian#/tmp/kali.txt as this would then allow the tool to process package lists from other machines (which is likely to be a typical use case).

BUG Checking a package list (obtained using dpkg-query) and attempting to do a trial install results in a InvalidListError: Invalid Package list exception together with list of modules with no installation candidate. It is not clear why an exception is being raised rather than a warning; it also isn't clear why this check is being performed as overriding the check doesn not have any effect on the processing. Suggested change

                    LOGGER.info(f"Package not found\n{output.stderr.decode('utf-8')}")
                #     with ErrorHandler(mode=error_mode):
                #         raise InvalidListError(
                #             f"Invalid Package list\n{output.stderr.decode('utf-8')}"
                #         )

ENHANCEMENT Instead of querying package manager for the version of each package, why is this not included in the file format which would then speed up the processing of the list?

ENHANCEMENT Reduce the amount of output generated by changing the logging level to DEBUG in cli.py.

REFACTOR The Vendor Fetch class has direct acccess to the database. Suggest that this is encapsulated within the cvedb module to localise all database access/queries.

BUG There are numerous reports of products with UNKNOWN Vendor and hence no reported CVEs. Why are these being reported in the "New Found CVEs" section (of console output). Similarly there are products with 'guessed' vendors but no reported CVEs. The reported CVEs are in the 'Unexplored CVEs' section which seems strange

BUG In the example file from a Kali system, dpkg is at version 1.20.9Kali. The reported CVEs for dpkg include

debian* │ dpkg │ 1.20.9kali1 │ CVE-2015-0840 │ MEDIUM │ 4.3 (v2)

However looking at the CVE database, CVE-2015-0840 is reported against version 1.17.x of dpkg. Is there an issue in parsing version numbers that don't conform the PEP standard?

anthonyharrison avatar Oct 25 '21 21:10 anthonyharrison

BUG If the distribution is not supported (e.g. Kali, Raspian) the error message is "Invalid Package list". A more descriptive error message may be more appropriate e.g. 'Unsupported Distribution'. However it might be even better to do something earlier in checkfile before attempting to process the file e.g.

This is great. I only thought of capturing the invalid package lists and tested that most of the time.

ENHANCMENT Instead of detecting the type of distribution of the machine running the tool, it might be a good idea to specify the distribution type (e.g. debian, rpm, arch, windows, bsd) with the filename e.g. debian#/tmp/kali.txt as this would then allow the tool to process package lists from other machines (which is likely to be a typical use case).

Hmm maybe a more elegant solution would be a 2nd argument for the -L flag (need more investigation if that's possible or we can go with naming the files).

BUG Checking a package list (obtained using dpkg-query) and attempting to do a trial install results in a InvalidListError: Invalid Package list exception together with list of modules with no installation candidate. It is not clear why an exception is being raised rather than a warning; it also isn't clear why this check is being performed as overriding the check doesn not have any effect on the processing.

Hmm then we should change this for other distros' cases too.

ENHANCEMENT Instead of querying package manager for the version of each package, why is this not included in the file format which would then speed up the processing of the list?

Can you explain this a bit more? I didn't get how versions are included in file formats.

REFACTOR The Vendor Fetch class has direct acccess to the database. Suggest that this is encapsulated within the cvedb module to localise all database access/queries.

Agreed.

BUG There are numerous reports of products with UNKNOWN Vendor and hence no reported CVEs. Why are these being reported in the "New Found CVEs" section (of console output). Similarly there are products with 'guessed' vendors but no reported CVEs. The reported CVEs are in the 'Unexplored CVEs' section which seems strange

#1132 related?

BreadGenie avatar Oct 26 '21 06:10 BreadGenie

I'm filing some of these issues individually while working on it

BreadGenie avatar Oct 26 '21 06:10 BreadGenie

@BreadGenie

In response to your comment 'Can you explain this a bit more? I didn't get how versions are included in file formats.', the following command does what I was proposing.

dpkg-query --show --showformat='{"name": "${binary:Package}", "version": "${Version}"}' > pkg-list

If we used this command instead of dpkg-query -W -f '${binary:Package}\n' > pkg-list we wouldn't need to call dpkg-query in parse_list nor the call to xargs call in check_file as we could just load the json file to create a list of installed packages.

Clearly commands for other types of distro would need to be updated.

anthonyharrison avatar Oct 26 '21 16:10 anthonyharrison

@BreadGenie Instead of -L parameter having two parameters maybe we should have an extra option -D (for distro)? Default for -D could be current distro (obtained from distro.id()). This would remove the need to call distro.id() in the package_list parser module with distro being set during the creation of the class (so we have self.distro).

anthonyharrison avatar Oct 26 '21 16:10 anthonyharrison

@anthonyharrison if I get it right if we have versions in the package list then we don't need to query dpkg-query or other package managers right? This is would be a great add-on. Thanks for the explanation.

Btw I think we should go with something like product==version format so that we could stay consistent with python package lists.

BreadGenie avatar Oct 26 '21 16:10 BreadGenie

I think most of the solvable stuff in here has been solved, and we're focusing more on the SBOM parsers at this point so I don't think anyone's going to be spending a bunch of time on the package list parser anytime soon. So I'm going to close this now.

terriko avatar Apr 17 '24 21:04 terriko