Dependencies in `setup.py` have module conflicts.
Describe the bug
There are two dependencies mentioned in the setup.py file: opencv-python and albumentations and the 'albumentations' depends on opencv-python-headless. The official spec mentioned that the opencv-python package is for the desktop environment, while opencv-python-headless is for the server environment. The documentation also states that these two packages cannot be installed simultaneously (the exact wording is: “There are four different packages (see options 1, 2, 3, and 4 below) and you should SELECT ONLY ONE OF THEM.”). This is because they both use the same module name cv2.
During the installation process using pip, the package installed later will override the cv2 module from the previously installed package (specifically, the modules within the cv2 folders that exist in both packages). Furthermore, the dependency graph even includes different versions of these two packages. It is certain that the common files with the same path in these two packages contain different contents. Therefore, there may be functional implications when using them. However, without analyzing the specific code and function call hierarchy of this project, it can be stated that issues related to overwriting and module conflicts do exist.
To Reproduce
pip install deepforest
Screenshots
Additional context Indeed, it is not an ideal behavior for modules to be overwritten, even if they are not actively used or if the overwritten module is the one being called. It introduces uncertainty and can cause issues in the long run, especially if there are changes or updates to the overwritten modules in future development. It is generally recommended to avoid such conflicts and ensure that only the necessary and compatible dependencies are declared in the requirements to maintain a stable and predictable environment for the project.
We believe that although this project can only modify direct dependencies and indirect dependencies are a black box, it is possible to add additional explanations rather than directly declaring both conflicting packages in the requirements.txt file.
Adding extra explanations or documentation about the potential conflicts and the need to choose only one of the conflicting packages can help developers understand the issue and make informed decisions. Including clear instructions or warnings in the project’s documentation can guide users to choose the appropriate package based on their specific requirements.
Thanks for this. We'll need to look into it. Did this cause a specific error? My gut feeling is that we can just rid of albumentations. It has caused problems and i'm not convinced we actually need it. Let's just go back to torchvision transforms and users can always add on from there.
Is this issue reproducible? It seems unlikely this is a broad issue since we install regularly on all sorts of testing environments, so this must be an edge case of sometime. I lean towards closing unless someone can duplicate.
Is this issue reproducible? It seems unlikely this is a broad issue since we install regularly on all sorts of testing environments, so this must be an edge case of sometime. I lean towards closing unless someone can duplicate.
The fact that both are installed at the same time is not necessarily a serious problem, as it may overwrite modules that are not used or imported by your project, but it still causes the integrity of the user's native package to be compromised. However, this is also an aspect to consider when problems arise in the future.
Thanks @unsatisfying - this is a good point and I should have read the initial issue a little more carefully. We'll see what we can do
From the opencv docs:
You should always use [the headless] packages if you do not use cv2.imshow et al. or you are using some other package (such as PyQt) than OpenCV to create your GUI.
We currently only use cv2.imshow once in visualize.view_dataset:
https://github.com/weecology/DeepForest/blob/bda634ef4c9e8adfdc9b46410673b0d9cc174132/deepforest/visualize.py#L16-L35
This function doesn't appear to be used elsewhere in the codebase and based on the docstring may only be being used for debugging. Given that I'm wondering if we would change that function so that it plots using matplotlib, which is what we do everywhere else. Then we could just switch to opencv-python-headless, which is also a less heavy dependency.
I think that would be a great idea!