scancode-toolkit

scancode-toolkit now depends on pyicu

Open Zinurist opened this issue 4 months ago • 7 comments

Description

Last week, a dependency of fingerprints (which scancode-toolkit uses) was updated so that it now depends on pyicu, which means scancode-toolkit now also depends on pyicu.

I don't think that's intentional, so I opened an issue here: https://github.com/opensanctions/fingerprints/issues/23

Also worth mentioning: fingerprints is now unmaintained (see the notice in the repo: https://github.com/opensanctions/fingerprints). They mention that rigour should be used instead.

How To Reproduce

Installing scancode-toolkit 32.4.1 via pip will try to install pyicu. This fails on my system, for example, since pyicu has some extra requirements when installing with pip.
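One way to check which installed package pulls in pyicu is to inspect the declared requirements. The following is only a minimal sketch using the standard library's `importlib.metadata`; the helper names are made up for illustration, and treating any requirement with an `extra` marker as optional is a heuristic, not an exact parser.

```python
import re
from importlib import metadata


def requirement_name(requirement):
    """Extract the normalized project name from a requirement string."""
    match = re.match(r"\s*([A-Za-z0-9][A-Za-z0-9._-]*)", requirement)
    if not match:
        return ""
    return re.sub(r"[-_.]+", "-", match.group(1)).lower()


def declares_hard_dependency(requirements, wanted="pyicu"):
    """Return True if `wanted` is required unconditionally (not via an extra)."""
    for requirement in requirements or []:
        head, _, marker = requirement.partition(";")
        if "extra" in marker:
            # Heuristic: requirements gated on an extra are optional ("soft").
            continue
        if requirement_name(head) == wanted:
            return True
    return False


def installed_requirements(distribution):
    """Return the declared requirements of an installed distribution, or []."""
    try:
        return metadata.requires(distribution) or []
    except metadata.PackageNotFoundError:
        return []


if __name__ == "__main__":
    # Distribution names taken from this thread; adjust as needed.
    for name in ("scancode-toolkit", "fingerprints", "normality"):
        hard = declares_hard_dependency(installed_requirements(name))
        print(f"{name}: hard pyicu requirement = {hard}")
```

Packages that are not installed simply report `False` here, so this only says something about the environment it is run in.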

Zinurist avatar Jul 29 '25 06:07 Zinurist

Thanks for the report @Zinurist

See also https://github.com/opensanctions/fingerprints/issues/23#issuecomment-3133117919 and https://github.com/pudo/normality/commit/56aaf775356d376db87f74048dc65c7435e2c5b4

Previously they soft-depended on pyicu, but now they are making it a hard requirement because transliteration is broken otherwise.

I am not too sure about the effect on scancode, so I will look for broken tests first to see changes in functionality and then look into how this affects us.

See also from https://github.com/aboutcode-org/scancode-toolkit/pull/4498#discussion_r2247707824

We cannot upgrade fingerprints just yet because they have removed support for python3.9 (see https://github.com/opensanctions/fingerprints/blob/8eea92ff357e080c6c8bb0807352e7b5f83d9c14/pyproject.toml#L19C1-L19C28), which is not EOL yet (https://endoflife.date/python). This is also why the tests are failing: we test and support python3.9. Note also that, as mentioned in https://github.com/aboutcode-org/scancode-toolkit/issues/4493#issue-3272342439, fingerprints is now unmaintained, so we would want to consider switching to rigour in the future.

Note also that rigour does not support python3.9 either, so we cannot make this switch just yet: https://github.com/opensanctions/rigour/blob/main/pyproject.toml#L20

Another thing is that they introduce new dependencies (https://github.com/opensanctions/rigour/blob/main/pyproject.toml#L21), so we need to be careful there.

So the short-term fix could be to:

  1. install fingerprints conditionally based on the python version: 1.2.3 for python3.9 and 1.3.0 for python3.10+ (if this works fine on all supported os/python versions). I've requested this at https://github.com/aboutcode-org/scancode-toolkit/pull/4498#pullrequestreview-3078802024
  2. place an upper bound of 1.2.3 on fingerprints, as we need to support python3.9 for another couple of months. This is essentially what was also done in scancode.io: https://github.com/aboutcode-org/scancode.io/pull/1796/files
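As a rough sketch of what the two options could look like as requirement strings with environment markers: the exact pins below are assumptions taken from the versions mentioned above, not the configuration that was actually merged, and `select_fingerprints_pin` is a hypothetical helper that only mirrors what the markers in option 1 would select.

```python
import sys

# Option 1 (assumed form): one pin per interpreter range, using
# standard environment markers on python_version.
OPTION_1_REQUIREMENTS = [
    'fingerprints == 1.2.3; python_version < "3.10"',
    'fingerprints == 1.3.0; python_version >= "3.10"',
]

# Option 2 (assumed form): a single upper bound for all interpreters.
OPTION_2_REQUIREMENTS = [
    "fingerprints <= 1.2.3",
]


def select_fingerprints_pin(version_info=sys.version_info):
    """Return the pin that option 1 would pick for a given Python version.

    Hypothetical helper for illustration only; pip evaluates the markers
    itself at install time.
    """
    if tuple(version_info[:2]) < (3, 10):
        return "fingerprints == 1.2.3"
    return "fingerprints == 1.3.0"


if __name__ == "__main__":
    print(select_fingerprints_pin())
```

Strings like these would typically go in `install_requires` or a requirements file; whether 1.3.0 installs cleanly on all supported os/python combinations would still need to be tested.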

AyanSinhaMahapatra avatar Aug 01 '25 11:08 AyanSinhaMahapatra

It is likely best if we can avoid any dependency on pyicu. It is notoriously hard to build and use on various cloud and hosting platforms, like Heroku and Fly, due to its weird dependencies and build environment.

jimjag avatar Aug 07 '25 12:08 jimjag

This also means that "uvx --from scancode-toolkit scancode" fails on all my systems (Linux, MacOS and Windows).

th avatar Aug 13 '25 08:08 th

We have merged the following PR:

  • https://github.com/aboutcode-org/scancode-toolkit/pull/4531

We now need to make a release so that pip install scancode-toolkit works again.

Will keep this issue open anyway, as we still need to decide what to do long term wrt. fingerprints/rigour and depending on pyicu.

AyanSinhaMahapatra avatar Aug 22 '25 08:08 AyanSinhaMahapatra

Eventually we will need to fork upstream fingerprints and normality, as we do not need the accuracy that pyicu provides at all.

pombredanne avatar Sep 30 '25 16:09 pombredanne

Has this been released? I've tried uvx, pip, and local and I can't get a functioning cli.

steveoh avatar Nov 17 '25 23:11 steveoh

🦗

steveoh avatar Dec 01 '25 18:12 steveoh