Ability to install just hashing dependencies
The dependencies for CNN-based methods are heavy (torch), and considering that a lot of people want to use just the hashing methods, I think the ability to install only a subset of the functionality would be great.
The idea is to allow something like:
`pip install imagededup[cnn]` to install the optional dependency. All the hashing methods could be installed by default, as they are reasonably lightweight.
I believe that this feature would make the library even more awesome and approachable and even allow other projects to seamlessly integrate it without needing to add such a big dependency.
Alternatively, the CNN methods could be converted to use a lightweight inference engine like onnxruntime-gpu. It would probably require reworking a bunch of things, like the custom model support, but I agree that a super heavy requirement like torchvision (>1GB) is a dealbreaker in many situations where you just want simple image deduplication.
Thanks for the suggestion. I see 2 ways of proceeding here:
- Create 2 separate packages, which would fragment the build/deploy pipeline and hence is not a great option.
- Manage torch (which is the main culprit here) as an optional dependency in `pyproject.toml` and make it installable via `pip install imagededup[cnn]`. I would investigate in this direction first.
If there are any other suggestions that might help here, please feel free to share.
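
If torch does become an optional extra, the CNN code path would probably need to defer its torch import and fail with a clear message when the extra is missing. Below is a minimal sketch of that pattern, not imagededup's actual code: `require_optional` and `CNNEncoder` are hypothetical names used only for illustration, and the extra name `cnn` is taken from the proposal above.

```python
import importlib.util


def require_optional(module_name: str, extra: str, package: str = "imagededup") -> None:
    """Raise a helpful ImportError if an optional top-level module is not installed.

    Hypothetical helper: checks availability without actually importing the module.
    """
    if importlib.util.find_spec(module_name) is None:
        raise ImportError(
            f"'{module_name}' is required for this feature. "
            f"Install it with: pip install {package}[{extra}]"
        )


class CNNEncoder:
    """Hypothetical stand-in for a CNN-based encoder (not the library's real class)."""

    def __init__(self) -> None:
        # Check first so users get an actionable message instead of a bare
        # ModuleNotFoundError from deep inside the package.
        require_optional("torch", extra="cnn")
        import torch  # deferred import: only runs when the CNN path is used

        self._torch = torch
```

With this arrangement, importing the package and using the hashing methods never touches torch; only constructing the CNN encoder does.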