DeepForest icon indicating copy to clipboard operation
DeepForest copied to clipboard

Dependencies in `setup.py` have module conflicts.

Open unsatisfying opened this issue 2 years ago • 4 comments

Describe the bug There are two dependencies mentioned in the setup.py file: opencv-python and albumentations and the 'albumentations' depends on opencv-python-headless. The official spec mentioned that the opencv-python package is for the desktop environment, while opencv-python-headless is for the server environment. The documentation also states that these two packages cannot be installed simultaneously (the exact wording is: “There are four different packages (see options 1, 2, 3, and 4 below) and you should SELECT ONLY ONE OF THEM.”). This is because they both use the same module name cv2.

During the installation process using pip, the package installed later will override the cv2 module from the previously installed package (specifically, the modules within the cv2 folders that exist in both packages). Furthermore, the dependency graph even includes different versions of these two packages. It is certain that the common files with the same path in these two packages contain different contents. Therefore, there may be functional implications when using them. However, without analyzing the specific code and function call hierarchy of this project, it can be stated that issues related to overwriting and module conflicts do exist.

To Reproduce pip install deepforest

Screenshots image

Additional context Indeed, it is not an ideal behavior for modules to be overwritten, even if they are not actively used or if the overwritten module is the one being called. It introduces uncertainty and can cause issues in the long run, especially if there are changes or updates to the overwritten modules in future development. It is generally recommended to avoid such conflicts and ensure that only the necessary and compatible dependencies are declared in the requirements to maintain a stable and predictable environment for the project.

We believe that although this project can only modify direct dependencies and indirect dependencies are a black box, it is possible to add additional explanations rather than directly declaring both conflicting packages in the requirements.txt file.

Adding extra explanations or documentation about the potential conflicts and the need to choose only one of the conflicting packages can help developers understand the issue and make informed decisions. Including clear instructions or warnings in the project’s documentation can guide users to choose the appropriate package based on their specific requirements.

unsatisfying avatar Jul 27 '23 05:07 unsatisfying

Thanks for this. We'll need to look into it. Did this cause a specific error? My gut feeling is that we can just rid of albumentations. It has caused problems and i'm not convinced we actually need it. Let's just go back to torchvision transforms and users can always add on from there.

bw4sz avatar Aug 03 '23 17:08 bw4sz

Is this issue reproducible? It seems unlikely this is a broad issue since we install regularly on all sorts of testing environments, so this must be an edge case of sometime. I lean towards closing unless someone can duplicate.

ethanwhite avatar Mar 25 '24 23:03 ethanwhite

Is this issue reproducible? It seems unlikely this is a broad issue since we install regularly on all sorts of testing environments, so this must be an edge case of sometime. I lean towards closing unless someone can duplicate.

The fact that both are installed at the same time is not necessarily a serious problem, as it may overwrite modules that are not used or imported by your project, but it still causes the integrity of the user's native package to be compromised. However, this is also an aspect to consider when problems arise in the future.

unsatisfying avatar Apr 09 '24 09:04 unsatisfying

Thanks @unsatisfying - this is a good point and I should have read the initial issue a little more carefully. We'll see what we can do

ethanwhite avatar Apr 17 '24 11:04 ethanwhite

From the opencv docs:

You should always use [the headless] packages if you do not use cv2.imshow et al. or you are using some other package (such as PyQt) than OpenCV to create your GUI.

We currently only use cv2.imshow once in visualize.view_dataset: https://github.com/weecology/DeepForest/blob/bda634ef4c9e8adfdc9b46410673b0d9cc174132/deepforest/visualize.py#L16-L35

This function doesn't appear to be used elsewhere in the codebase and based on the docstring may only be being used for debugging. Given that I'm wondering if we would change that function so that it plots using matplotlib, which is what we do everywhere else. Then we could just switch to opencv-python-headless, which is also a less heavy dependency.

ethanwhite avatar Aug 11 '24 15:08 ethanwhite

I think that would be a great idea!

bw4sz avatar Aug 11 '24 22:08 bw4sz