imagededup

Ability to install just hashing dependencies

Open BbqGamer opened this issue 1 year ago • 2 comments

The dependencies for CNN-based methods are heavy (torch), and since many people only want to use the hashing methods, I think the ability to install just a subset of the functionality would be great.

The idea is to allow something like:

  • pip install imagededup[cnn] - to install the optional CNN dependencies. All the hashing methods could be installed by default, as they are reasonably lightweight.

I believe that this feature would make the library even more awesome and approachable and even allow other projects to seamlessly integrate it without needing to add such a big dependency.

BbqGamer avatar Aug 22 '24 20:08 BbqGamer

Alternatively, the CNN methods could be converted to use a lightweight inference engine like onnxruntime-gpu. It would probably require reworking a bunch of things, like the custom model support, but I agree that a super heavy requirement like torchvision (>1GB) is a dealbreaker in many situations where you just want simple image deduplication.

JeroenDelcour avatar Oct 15 '24 12:10 JeroenDelcour

Thanks for the suggestion. I see 2 ways of proceeding here:

  • Create 2 separate packages, which would fragment the build/deploy pipeline and hence is not a great option.
  • Manage torch (which is the main culprit here) as an optional dependency in pyproject.toml and make it installable via pip install imagededup[cnn]. I would investigate in this direction first.
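
For the optional-dependency route, the library code would also need to avoid importing torch at module import time. A rough sketch of one common way to do that is below. The names require_optional and CNNEncoder are hypothetical and only for illustration, not imagededup's actual API, and the "cnn" extra is assumed to be declared in the packaging config as proposed above:

```python
import importlib
from types import ModuleType


def require_optional(module_name: str, extra: str, package: str = "imagededup") -> ModuleType:
    """Import an optional dependency, or raise an ImportError naming the extra that provides it.

    Hypothetical helper for illustration; not part of imagededup.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"'{module_name}' is required for this feature. "
            f"Install it with: pip install {package}[{extra}]"
        ) from exc


class CNNEncoder:
    """Hypothetical stand-in for a CNN-based encoder (not the library's real class)."""

    def __init__(self) -> None:
        # torch is imported only when a CNN encoder is constructed,
        # so hashing-only users never need it installed.
        self.torch = require_optional("torch", extra="cnn")
```

With something like this, importing the package and using the hashing methods would work without torch, while constructing the CNN encoder without the extra installed would fail with a message pointing to pip install imagededup[cnn].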

If there are any other suggestions that might help here, please feel free to share.

tanujjain avatar Jul 28 '25 12:07 tanujjain