scancode-toolkit
Scancode 30.1.0 is broken due to commoncode update
As explained on the ORT Slack channel (https://ort-talk.slack.com/archives/C9NNJ54B1/p1663241492555069), we are running ScanCode 30.1.0 from a pip installation. When scanning Semver4j, we encounter the following exception:
ort:/tmp$ /opt/python/shims/scancode --copyright --license --info --strip-root --timeout 300 --max-in-memory 5000 --processes 3 semver4j/ --json-pp /tmp/ort-ScanCode6979065251175861071/result.json
/opt/python/versions/3.10.6/lib/python3.10/site-packages/cluecode/copyrights.py:3382: FutureWarning: Possible set difference at position 3
remove_tags = re.compile(
Setup plugins...
Collect file inventory...
Scan files for: info, licenses, copyrights with 3 process(es)...
[--------------------] 0
Removing temporary files...done.
Traceback (most recent call last):
  File "/opt/python/versions/3.10.6/bin/scancode", line 8, in <module>
    sys.exit(scancode())
  File "/opt/python/versions/3.10.6/lib/python3.10/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/opt/python/versions/3.10.6/lib/python3.10/site-packages/commoncode/cliutils.py", line 70, in main
    return click.Command.main(
  File "/opt/python/versions/3.10.6/lib/python3.10/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/opt/python/versions/3.10.6/lib/python3.10/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/opt/python/versions/3.10.6/lib/python3.10/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/opt/python/versions/3.10.6/lib/python3.10/site-packages/click/decorators.py", line 26, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/opt/python/versions/3.10.6/lib/python3.10/site-packages/scancode/cli.py", line 451, in scancode
    success, _results = run_scan(
  File "/opt/python/versions/3.10.6/lib/python3.10/site-packages/scancode/cli.py", line 887, in run_scan
    scan_success = run_scanners(
  File "/opt/python/versions/3.10.6/lib/python3.10/site-packages/scancode/cli.py", line 1125, in run_scanners
    scan_success = scan_codebase(
  File "/opt/python/versions/3.10.6/lib/python3.10/site-packages/scancode/cli.py", line 1236, in scan_codebase
    scan_timings) = next(scans)
  File "/opt/python/versions/3.10.6/lib/python3.10/site-packages/click/_termui_impl.py", line 116, in __next__
    return next(iter(self))
  File "/opt/python/versions/3.10.6/lib/python3.10/site-packages/commoncode/cliutils.py", line 173, in generator
    for rv in self.iter:
  File "/opt/python/versions/3.10.6/lib/python3.10/site-packages/scancode/pool.py", line 52, in wrap
    return func(self, timeout=timeout or 3600)
  File "/opt/python/versions/3.10.6/lib/python3.10/multiprocessing/pool.py", line 873, in next
    raise value
AttributeError: 'ScannedResource' object has no attribute 'rid'
According to @pombredanne, this is because of an update in commoncode:
There have been major incompatible updates in the commoncode library since scancode-toolkit 30.1.0 was released. There are basically two ways to install scancode:
- using the all-in-one application archives that contain the exact dependency versions, vendored, tested and guaranteed to work together.
- using a pip install, in which case you are in control of ensuring that you use a set of dependencies that work together. This is typically the case when you use scancode as a library. We try to avoid setting upper bounds on dependency constraints for the reasons explained here: https://iscinumpy.dev/post/bound-version-constraints/
The resolution in your case could be:
- use the application archive instead of a pip installation
- use a pip installation using the requirements file from the released version fetched from the git repo as a constraint file to ensure you get the exact same pinned versions as the application archive. For example:
wget https://raw.githubusercontent.com/nexB/scancode-toolkit/v30.0.1/requirements.txt # fetch the requirements file of the proper version
python -m venv venv # create an isolated virtualenv in the "venv" dir using the "venv" standard module
venv/bin/pip install -U pip setuptools wheel # ensure that core tools are using latest possible stable versions
venv/bin/pip install scancode-toolkit==30.0.1 --constraint requirements.txt # install using a constraints file
venv/bin/scancode --help
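To check whether an existing pip installation has drifted from the pinned versions, the output of `pip freeze` can be compared against the requirements file. The following is a minimal sketch, not part of ScanCode: `check_pins` is a hypothetical helper name, it only looks at plain `name==version` lines, and it does not normalize package names beyond lowercasing them.

```shell
# check_pins CONSTRAINTS FROZEN  (hypothetical helper, for illustration only)
# CONSTRAINTS: a requirements/constraints file with "name==version" pins.
# FROZEN: a file holding the output of "pip freeze".
# Prints one line per package whose installed version differs from its pin.
check_pins() {
    awk -F'==' '
        # First file: remember every exact pin, keyed by lowercased name.
        NR == FNR { if (NF == 2) pinned[tolower($1)] = $2; next }
        # Second file: report installed versions that differ from the pin.
        NF == 2 {
            name = tolower($1)
            # Appending "" forces a string comparison of the versions.
            if (name in pinned && (pinned[name] "") != ($2 ""))
                printf "%s: installed %s, pinned %s\n", name, $2, pinned[name]
        }
    ' "$1" "$2"
}
```

Assuming the `requirements.txt` fetched above, one would run `venv/bin/pip freeze > frozen.txt` followed by `check_pins requirements.txt frozen.txt`; no output means that every pinned package that is installed matches its pin.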
In any case, please file an issue in ScanCode so we have proper visibility on this. There is likely some further vendoring of key libraries that we could do in all cases to shield against libraries being updated in the future. I cannot fix this in these old versions short of pushing a release for code that's over a year old. If I can avoid this, that would be best, since there is a workaround.
For reference, this was introduced by these changes: https://github.com/nexB/commoncode/commit/605d6984081a15689d69a6a7626dbd6b0ce50a52 and https://github.com/nexB/commoncode/blob/main/CHANGELOG.rst#version-3100---2022-08-24
Thank you! Somehow I vaguely recalled doing something about this not too long ago. And @mnonnenmacher pointed to https://github.com/oss-review-toolkit/ort/commit/a243bdaa135c47140c5e3585c97d6bb4a89e727e and https://github.com/oss-review-toolkit/ort/pull/5692
In any case, we need to have a better way to deal with this in the future.
@nnobelis any workarounds for this?
@mpcen Please have a look at the recommendations of @pombredanne above on how to solve this:
The resolution in your case could be:
- use the application archive instead of a pip installation
- use a pip installation using the requirements file from the released version fetched from the git repo as a constraint file to ensure you get the exact same pinned versions as the application archive.