cve-bin-tool
feat: added debian parser
closes #2917
Only review and Merging left @terriko @anthonyharrison
@crazytrain328 Can you please add some tests and include some sample data to demonstrate the parser working?
@anthonyharrison Could you tell me how to do that? I have worked on adding fuzz testing to parsers before. Do I do the same here?
Codecov Report
Attention: Patch coverage is 57.42574% with 43 lines in your changes are missing coverage. Please review.
Project coverage is 80.35%. Comparing base (d6cbe40) to head (d0b260a). Report is 179 commits behind head on main.
Files | Patch % | Lines |
---|---|---|
cve_bin_tool/parsers/deb.py | 55.29% | 34 Missing and 4 partials :warning: |
test/test_language_scanner.py | 66.66% | 4 Missing and 1 partial :warning: |
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3543      +/-   ##
==========================================
+ Coverage   75.41%   80.35%   +4.93%
==========================================
  Files         808      823      +15
  Lines       11983    12799     +816
  Branches     1598     1999     +401
==========================================
+ Hits         9037    10284    +1247
+ Misses       2593     2089     -504
- Partials      353      426      +73
Flag | Coverage Δ | |
---|---|---|
longtests | 75.27% <53.46%> (-0.15%) | :arrow_down: |
win-longtests | 78.54% <57.42%> (?) |
Flags with carried forward coverage won't be shown.
Hello @terriko, I was busy with my end-semester exams (finally they are over). I'm finally free and will be able to contribute regularly. Thanks for the help on this.
I am supposed to get the list of Deb_products to add to the script only after I run a test with DebParser using a test.deb file. I chose the test.deb file in test/assets. I am using the following code to test it:
import sys
import os
sys.path.append('/home/joydeep/dev/cve-bin-tool')
from cve_bin_tool.parsers.deb import DebParser
from cve_bin_tool.cvedb import CVEDB
from cve_bin_tool.log import LOGGER
cve_db = CVEDB()
logger = LOGGER
file_path = os.path.join(os.getcwd(), 'test.deb')
deb_parser = DebParser(cve_db=cve_db, logger=logger)
deb_parser.run_checker(file_path)
I have copied the test.deb file into the same directory as my testing file. I think there is something wrong in the way I'm calling Logger. Can you help me, @terriko @anthonyharrison?
@crazytrain328
The test is OK in your local environment to prove the functionality. However, what we need is a test within the cve-bin-tool test environment, so that we can add it to the test suite.
The language parsers all have test files in the test/language_data directory. Can you add your test.deb file to this directory and then update the test_language_scanner file to add your test code? I suggest you add a new test, test_debian_package, which calls the scanner and then asserts that the results are as expected.
Can you confirm that the parser is doing more than is already covered in the extractor module and tested in the test_extractor file, which explicitly has a test for files with a .deb extension?
@anthonyharrison This test is not working in my local environment. It executes, but it does not give any output. Since all console output in the run_checker() function goes through the logger object, I thought the way I'm using the logger in my test code was wrong.
I tried to change a few things, but my local test still gives no output. For example, I set the logging level down to DEBUG, but that does not help.
My code for testing:
import sys
import os
import logging # Import the logging module
sys.path.append('/home/joydeep/dev/cve-bin-tool')
from cve_bin_tool.parsers.deb import DebParser
from cve_bin_tool.cvedb import CVEDB
from cve_bin_tool.log import LOGGER
LOGGER.setLevel(logging.DEBUG)
cve_db = CVEDB()
logger = LOGGER
file_path = os.path.join(os.getcwd(), 'test.deb')
deb_parser = DebParser(cve_db=cve_db, logger=logger)
deb_parser.run_checker(file_path)
Modified run_checker() function:
def run_checker(self, filename):
    """Process .deb control file with file existence check"""
    self.logger.debug(f"Scanning .deb control file: {filename}")
    # Check if the file exists
    if not os.path.exists(filename):
        self.logger.error(f"File not found: {filename}")
        return  # Exit the method if file doesn't exist
    try:
        with open(filename) as file:
            control_data = file.read()
        product, version = self.parse_control_file(control_data)
        if product and version:
            product_info = self.find_vendor(product, version)
            if product_info:
                yield from product_info
        else:
            self.logger.debug(f"No product/version found in {filename}")
    except Exception as e:
        self.logger.error(f"Error processing file {filename}: {e}")
    self.logger.debug(f"Done scanning file: {filename}")
I'm stuck! Please help @terriko @anthonyharrison.
@crazytrain328 Can you provide the test.deb file that you are using?
I tried to run the parser in my environment. The run_checker routine wasn't being called. However, if I call a different routine in the class, it does get called, so I suspected there was something wrong with the way run_checker is defined or being called.
I created the following routine
def do_it(self, filename):
    print("DO IT")
    try:
        print("Read file")
        with open(filename) as file:
            control_data = file.read()
        print("File read")
    except:
        print("We have a problem")
    print("DONE IT")
I called this instead of run_checker. It raised the exception when reading a .deb file (I used the test.deb file in the test/assets directory). Renaming this routine to run_checker does result in run_checker being called. So I think you need to work through the run_checker routine line by line to validate the operation. Using print statements rather than logging may also help.
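One possible explanation for the behaviour above, assuming run_checker keeps the `yield from` shown in the posted version: any function containing `yield` is a generator function, so calling it only creates a generator object and the body does not run until something iterates over it. The sketch below uses a hypothetical `DemoParser` class (not the real DebParser) to show the difference between a plain method and a generator method.

```python
calls = []


class DemoParser:
    """Hypothetical stand-in for a parser class; not the real DebParser."""

    def run_checker(self, filename):
        # The `yield` below makes this a generator function.
        calls.append(f"scanning {filename}")
        yield ("vendor", "product", "1.0")

    def do_it(self, filename):
        # No `yield`, so the body runs as soon as the method is called.
        calls.append(f"did {filename}")


parser = DemoParser()
parser.do_it("test.deb")              # body runs immediately
gen = parser.run_checker("test.deb")  # only creates a generator; body has not run
calls_before_iteration = list(calls)

results = list(gen)                   # iterating is what runs the body
```

If this is the cause, wrapping the call in `list(...)` or a `for` loop in the local test script should make the log messages appear.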
@anthonyharrison I am also using the test.deb in the test/assets directory. But I will go through the run_checker() definition once again.
Have you updated the parse.py file? This calls the appropriate parser when it finds a particular file e.g. requirements.txt will invoke the python parser.
On Fri, 8 Dec 2023, 08:59 Joydeep Tripathy, @.***> wrote:
@anthonyharrison https://github.com/anthonyharrison I am also using the test.deb in the test/assets directory. But I will go through the run_checker() definition once again.
I was able to run the run_checker() function, but now I am getting an exception: "Unable to open database file".
I created my own .deb file and am trying to parse its control file with my DebParser. I also tried calling the GoParser with the go.mod file, and it gives me the same database error. Here's the local testing code:
import os
import sys
import logging
sys.path.append('/home/joydeep/dev/cve-bin-tool')
from cve_bin_tool.parsers.deb import DebParser
from cve_bin_tool.cvedb import CVEDB
from cve_bin_tool.log import LOGGER
# Set logger to DEBUG level
LOGGER.setLevel(logging.DEBUG)
# Verify the test file path
file_path = '/home/joydeep/mypackage/DEBIAN/control'
if not os.path.exists(file_path):
    raise FileNotFoundError(f"The file {file_path} does not exist.")
# Instantiate the database and the parser
cve_db = CVEDB()
deb_parser = DebParser(cve_db=cve_db, logger=LOGGER)
# Run the parser
try:
    for info in deb_parser.run_checker(file_path):
        print(info)
except Exception as e:
    LOGGER.error(f"Exception occurred: {e}")
Is there something wrong with the way I'm using CVEDB? @anthonyharrison @terriko
If I had to guess, the database problem is that your database hasn't been created or updated in a while. So you're making a new CVEDB() but you're not populating it.
To update your local db and make sure it's functioning:
cve-bin-tool -u now main/test/csv/triage.csv
(The file doesn't matter, I just chose one from our test suite so you can see if the database is working in other code.)
And in your code you'd probably want to call cvedb.get_cvelist_if_stale() to do the equivalent update. That said, this is where using the existing pytest harness would help you a lot over writing a separate test: we already have database setup code in the existing test/test_* files, and when you run it all in GitHub Actions you have access to the cached database, so you don't have to initialize it yourself. I'd strongly recommend that you move your test into pytest and use the existing framework before spending too long debugging this. You're going to have to do it eventually anyhow, because we need all tests to run through there before this code can be merged, so you might as well learn to do it that way first instead of figuring it out twice.
How do I get the DEBIAN_PRODUCTS that I have to add to the test/test_language_scanner.py file when I write the test using the existing pytest setup?
Usually you'd make this manually (i.e. cut and paste the data that you used when you created the file).
For example, if you look at https://github.com/intel/cve-bin-tool/blob/main/test/language_data/requirements.txt and then at the PYTHON_PRODUCTS array in https://github.com/intel/cve-bin-tool/blob/main/test/test_language_scanner.py you'll see that the test is just a subset of what could have been detected from the file.
In your case, since a debian package often contains only one product, you may have an array that's just the one thing you put into the metadata of the file, so you could probably write something like
def test_python_package(self, filename: str) -> None:
    assert scanner.scan_file(filename) == "debian_package"
Although you'll have to account for it returning an array rather than a single string or whatever it actually does (sorry, I've got to run to a meeting so I don't have time to double-check the api myself, but you can probably figure it out from the other tests!)
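As a rough illustration of the "expected list is a subset of what was detected" idea, here is a minimal sketch. The `found` list is made-up stand-in data, not real scanner output, and `missing_products` is a hypothetical helper rather than anything in the cve-bin-tool test suite; the real test would follow whatever the existing tests in test_language_scanner.py do.

```python
# Hypothetical expected data: in a real test this would mirror the metadata
# that was written into the .deb used as the test fixture.
DEBIAN_PRODUCTS = ["hello"]


def missing_products(found, expected):
    """Return the expected product names that were not reported."""
    found_set = set(found)
    return [name for name in expected if name not in found_set]


# Stand-in for whatever list of product names the scan actually yields.
found = ["hello", "libc6"]

# The check passes as long as every expected product was seen, even if the
# scan reported additional products.
assert missing_products(found, DEBIAN_PRODUCTS) == []
```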
Oh, and if you want to run just your new test to see how it works on your system, you can use the -k option:
pytest -vv -k test_control.deb
should probably get you just the new piece you added so you don't have to wait for a whole file worth of tests (or the whole test suite!) to complete.
I have listed all the products that my test_control.deb contains in the DEBIAN_PRODUCTS list. I also modified the DebParser code to extract the products and their versions more efficiently, but it still does not give me the desired output.
One thing I read about debian control files is that while the actual package has a .deb extension, the control file inside the package (which basically contains metadata about the debian package) is actually a text file (without extension). Should I write my tests to process a control.txt file?
Typically, you'd want to have the test process a .deb and find and parse the control.txt, so... both?
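Since the control file is plain text made of `Field: value` lines, pulling out the package name and version can be sketched in a few lines. This is only an illustration under simple assumptions (a single stanza, continuation lines starting with whitespace); `parse_control_text` and the sample data are hypothetical and are not the PR's parse_control_file.

```python
def parse_control_text(control_text):
    """Return (package, version) from the text of a Debian control file."""
    fields = {}
    for line in control_text.splitlines():
        # Blank lines end a stanza; lines starting with whitespace continue
        # the previous field (for example a long Description). Skip both.
        if not line or line[0] in " \t":
            continue
        key, separator, value = line.partition(":")
        if separator:
            fields[key.strip()] = value.strip()
    return fields.get("Package"), fields.get("Version")


# Made-up sample data for illustration only.
SAMPLE_CONTROL = """\
Package: hello
Version: 2.10-2
Architecture: amd64
Description: example package
 A longer description continues on indented lines.
"""

package, version = parse_control_text(SAMPLE_CONTROL)
```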
I've been trying to find a way to unpack Debian packages using Python and so far haven't had any luck. The test.deb file in test/assets has a structure like: test.deb ---> control.tar.xz ---> . ---> control ---> usr/bin ... I need ideas on how to proceed.
We have a deb extractor in extractor.py. I think it's called extract_file_deb or something equally obvious. You could probably just use that.
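For reference, the layout described above can also be read with the standard library alone: a .deb is an `ar` archive whose members include a `control.tar.*` tarball holding the `control` file. The sketch below is not the project's extractor, and `iter_ar_members` and `read_control_text` are hypothetical names. It reads the control file into memory with `extractfile` instead of unpacking anything to disk, and it assumes a compression the `tarfile` module supports (gzip, bzip2, xz).

```python
import io
import tarfile

AR_MAGIC = b"!<arch>\n"
AR_HEADER_SIZE = 60


def iter_ar_members(data):
    """Yield (name, payload) for each member of an ar archive held in bytes."""
    if not data.startswith(AR_MAGIC):
        raise ValueError("not an ar archive")
    offset = len(AR_MAGIC)
    while offset + AR_HEADER_SIZE <= len(data):
        header = data[offset:offset + AR_HEADER_SIZE]
        name = header[:16].decode("ascii").strip().rstrip("/")
        size = int(header[48:58].decode("ascii").strip())
        start = offset + AR_HEADER_SIZE
        yield name, data[start:start + size]
        # Member data is padded so the next header starts on an even offset.
        offset = start + size + (size % 2)


def read_control_text(deb_bytes):
    """Return the text of the control file inside a .deb given as bytes."""
    for name, payload in iter_ar_members(deb_bytes):
        if not name.startswith("control.tar"):
            continue
        with tarfile.open(fileobj=io.BytesIO(payload), mode="r:*") as tar:
            for member in tar:
                if member.isfile() and member.name in ("control", "./control"):
                    # Read into memory; nothing is written to the filesystem.
                    return tar.extractfile(member).read().decode("utf-8")
    raise ValueError("no control file found")
```

Calling `read_control_text(Path("test.deb").read_bytes())` would then return the control text, which can be handed to the parser.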
Hello @terriko, @anthonyharrison, @b31ngd3v, @Rexbeast2. I was finally able to make my code parse a Debian package and extract the contents of its control file. However, I wouldn't expect any CVEs to actually be present, since this is purely a test file, so if I write the usual test in test_language_scanner I'm bound to get an assertion error. How do I solve this? Also, please help me with the bandit linter issue: it reports the tarfile library as high severity. I went through all the docs and still couldn't find a way to build a tarfile extractor without using that library. Even Python's own library has a function, shutil.unpack_archive, which in turn uses _extract_tarfile functions, which use the tarfile library. What should I do?
Only review and merge left. Thanks @terriko. (I want to remove some of the comments from the code, but if I do it now it will get stuck in the CI due to failing tests. I will open another doc issue for that.)
Eagerly Waiting for review :)
Just a heads up: this has a bunch of merge conflicts now and will need some work.
I'll get back to solving this as soon as we have the PURL generation for language parsers and their tests figured out. @terriko @anthonyharrison
Marking this as blocked so I don't look at it again until after 3.3 is out.
Sure! I did mention working on this issue as part of the stretch goals in my GSoC project. Maybe I'll get to it in the community bonding period.
Btw, when will version 3.3 be coming out? @terriko
Hello @terriko, since the release is out, I was thinking we could finally work on finishing this one. Or should this be prioritised after the 3.3.1 release is out? (Merge conflicts have been resolved.)
I've barely started with 3.3.1 planning so I expect this will get merged long before there, but it's going to be at least a few weeks. I'm severely backlogged on non-cve-bin-tool stuff at the moment and have to put my focus elsewhere.
Absolutely no problem at all! I think this one is almost ready to merge except that I would have to add a script which creates a temporary debian file to test the code itself. But I thought it may end up being more expensive than what we are doing now.
As for testing files: for now, let's go with just including the .deb in git. I'm going to have to deal with the OpenSSF's insistence on there not being binary files eventually, but I think at the moment it's more important to me that we have a functional test, if it's not super easy to just have a makefile for it or something.