cve-bin-tool
cve-bin-tool copied to clipboard
discussion: bandit linter showing high severity_score for tarfile library
In my recent commit to my PR https://github.com/intel/cve-bin-tool/pull/3543 bandit linter shows that the used library tarfile has high severity_score. However, I went through all the documentations of the repo and also of python. I was not able to find any suitable method to extract tarfiles without using the tarfile library. Even the utility functions that we have in async_utils.py use tarfile library as they call unpack_archive function from shutil.py which calls tarfile to unpack a tarfile. Does anyone have any suggestions on how to tackle this problem? @terriko @anthonyharrison @b31ngd3v @Rexbeast2
Hello, unblob could be used to extract a wide variety of file formats including tar. unblob implemented a SafeTarFile
class as tarfile is indeed vulnerable to path traversal (https://github.com/onekey-sec/unblob/issues/459).
Hello, unblob could be used to extract a wide variety of file formats including tar. unblob implemented a
SafeTarFile
class as tarfile is indeed vulnerable to path traversal (onekey-sec/unblob#459).
Thank you @ffontaine . Will try it in code and let you know how it works
A few things:
- bandit is currently reporting even safe usage of the library (there's an open bug about that, or at least it was open when I checked last week. If we need to just update bandit someone could submit a pr for that.)
- but there isn't an approved safe usage supported by all versions of python that we support
- I've got a tentative setup in https://github.com/intel/cve-bin-tool/pull/3769 but it still doesn't cover everything and is missing a test at the moment.
- Somewhat ironically, it would actually be safer to call tar as a subprocess, the way we used to do 🤦
- If unblob works on windows we may need to integrate it into our extractor.py functions. I haven't looked at it yet so let me know how it works for you if you play with it.
That said, for PR #3543 I would suggest you just call our existing extractor functions rather than calling tarfile directly. That way, when extractor.py is fixed your code will also be fixed.
A few things:
- bandit is currently reporting even safe usage of the library (there's an open bug about that, or at least it was open when I checked last week. If we need to just update bandit someone could submit a pr for that.)
- but there isn't an approved safe usage supported by all versions of python that we support
- I've got a tentative setup in fix: use tarfile extract filters to open tarfiles more safely #3769 but it still doesn't cover everything and is missing a test at the moment.
- Somewhat ironically, it would actually be safer to call tar as a subprocess, the way we used to do 🤦
- If unblob works on windows we may need to integrate it into our extractor.py functions. I haven't looked at it yet so let me know how it works for you if you play with it.
That said, for PR #3543 I would suggest you just call our existing extractor functions rather than calling tarfile directly. That way, when extractor.py is fixed your code will also be fixed.
@terriko I took your advice and changed the code and committed . Next, Can I work on adding a Fuzz Testing code to the DebParser? Also, I'll look into unblob and if it works on windows or not. (Although I just tried to install it in windows and showed some wheel-build related errors :) )
Actually , since we are talking about extractors, I did come across one little thing. The tar files inside the test.deb that we have in test/assets (now also in test/language_data) contains a control.tar.xz file, which our current tar extractor cannot read probably because it has the .xz extension as well. What can we do about it ? @terriko @ffontaine
Actually , since we are talking about extractors, I did come across one little thing. The tar files inside the test.deb that we have in test/assets (now also in test/language_data) contains a control.tar.xz file, which our current tar extractor cannot read probably because it has the .xz extension as well. What can we do about it ? @terriko @ffontaine
Hey @crazytrain328 i was recently working with extractors, we do have .tar.xz extractor it is handled with extract_file_tar() in extractor.py , there is whole list of file extension that are supported you can check out the onces that are useful to your issue.
Hello @mastersans , Thanx for letting me know. The function I had used to extract was the aio_unpack_archive(), which is basically an asynchronous wrapper around the shutil function unpack_archive(). We would want the above function to be able to extract tar.xz files as well right? @terriko Also, @terriko Can you review this whenever you have time. Thanx
I'm gonna close this since it has already been resolved