Issue with `distance_to_anomaly_gdal` and `gdal_array`
At least one Windows user has reported that gdal_array was not available for them, so they could not run the optimized distance_to_anomaly tool. @okolekar , how did you set up your development environment / did you do anything specific to have gdal_array available for you?
Before we can be sure that Windows users are able to run distance_to_anomaly_gdal without issues, we can't make it the default/automatically selected version for EIS QGIS Plugins users that have Windows.
There are issues associated with GDAL >= 3.9, Python >= 3.9 and NumPy 2.0.
It is recommended to use GDAL 3.6.2 (released 2023/01/02) with Python 3.9.18.
Also, to enable numpy-based raster support, libgdal and its development headers must be installed, as well as the Python packages numpy, setuptools, and wheel. For pip users, the following commands install the Python dependencies and build numpy-based raster support (the version specifiers are quoted so that the shell does not treat > as a redirect):
pip install "numpy>1.0.0" wheel "setuptools>=67"
pip install gdal[numpy]=="$(gdal-config --version).*"
For Conda users: GDAL can be quite complex to build and install, particularly on Windows and macOS, so prebuilt binaries are provided through conda. It is recommended to use the following command:
conda install -c conda-forge gdal
If the problem still persists, try the following command:
conda install -c conda-forge gdal numpy setuptools wheel
If the issue still persists after that, I would like to know whether the error looks like this:
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/local/lib/python3.12/dist-packages/osgeo/gdal_array.py", line 10, in <module>
    from . import _gdal_array
ImportError: cannot import name '_gdal_array' from 'osgeo' (/usr/local/lib/python3.12/dist-packages/osgeo/__init__.py)
If so, the likely cause is that pip or conda is reusing a cached GDAL installation. Use the following commands to make sure that the correct version is installed and used.
For Conda users
conda install -c conda-forge gdal numpy
For pip users
pip install --no-cache --force-reinstall gdal[numpy]=="$(gdal-config --version).*"
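To confirm whether the reinstall helped, a quick check like the one below could be run in the same environment. This is only a minimal sketch: `gdal_array_status` is a hypothetical helper name, not part of GDAL or EIS Toolkit, and it assumes that a broken installation shows up as an `ImportError`, as in the traceback above.

```python
import importlib


def gdal_array_status():
    """Return (available, detail) for the osgeo.gdal_array module.

    Hypothetical helper: it only attempts the import and reports the outcome.
    """
    try:
        importlib.import_module("osgeo.gdal_array")
    except ImportError as exc:
        # Covers a missing GDAL install as well as a missing _gdal_array extension.
        return False, str(exc)
    return True, "osgeo.gdal_array imported successfully"


available, detail = gdal_array_status()
print(f"gdal_array available: {available} ({detail})")
```

If this prints `False` together with the `_gdal_array` message, the bindings were probably built without NumPy support and the forced reinstall above is worth trying.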
Hi @okolekar , sorry I haven't had time to think about this for a while.
I don't have Windows myself, so it's difficult for me to test whether these methods work. However, users should ideally be able to install everything simply with pip install eis_toolkit, without additional configuration. Do you think that can be achieved even if we use gdal_array?
Hi @nmaarnio, sorry for the late reply. Ideally, the conda platform provides everything and does not rely on the user to take any additional steps. With pip it is a bit different, as pip does not handle everything the way conda does. But I think a configuration file should be able to orchestrate everything. I have Windows 10 Pro, and on this PC it works without problems in a Conda environment. I will try to install the same with pip and let you know.
I see that for EIS Toolkit we need to set up a Conda environment. So in theory this issue should not surface, because Conda takes care of all the requirements.
Hi @okolekar , unfortunately right now conda environments are not as good as just using venv + pip. Not all users have or want to use conda, and recently we discovered a license issue that relates to tensorflow and the default channel of conda. So we should prioritize that everything works well with venv and pip.
The good news is that @msorvoja managed to optimize distance_computation using Numba, at least to some extent. We might not reach the same speeds as with gdal, but this at least relieves the pressure to get this optimization up and running for everybody.
I will try to work with venv and let you know as soon as I am done.
Hi @nmaarnio , I tried to install the gdal library, but the library seems to be a bit stubborn and is not available for venv + pip users. Unfortunately, conda seems to be the only way to make it work. I am checking if it works with OSGeo4W.
you need conda for gdal in windows, especially for the average user you are targeting
Yes, I tried a lot yesterday to find a way with pip, but it seems that neither a wheel file nor any other precompiled file is available. There was a precompiled file made available unofficially by Christoph Gohlke, but it simply does not exist any more.
Okay, this is unfortunate. We can still offer the optimized tool that uses gdal_array as an option for users of EIS Toolkit, but then we cannot use it in the CLI function or for plugin users.
The optimization of distance_computation is close to completion, so we should check how distance_to_anomaly, which uses the optimized distance_computation in the background, compares to the gdal_array version of distance_to_anomaly. If the performance is close enough, good; if not, let's explore other ways to optimize distance_to_anomaly. I believe a similar Numba-compatible version could be created.
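The comparison described above could be run with a small timing helper along these lines. This is only a sketch: `compare_runtimes` and the two placeholder callables are hypothetical and not part of EIS Toolkit. In practice, the callables would wrap the Numba-based and the gdal_array-based distance_to_anomaly calls on the same input raster.

```python
import timeit


def compare_runtimes(implementations, repeat=3):
    """Time each zero-argument callable and return its best runtime in seconds.

    `implementations` maps a label to a callable; both are illustrative only.
    """
    results = {}
    for name, func in implementations.items():
        # Take the minimum over several runs to reduce noise from other processes.
        results[name] = min(timeit.repeat(func, number=1, repeat=repeat))
    return results


# Placeholder workloads standing in for the two distance_to_anomaly variants.
timings = compare_runtimes(
    {
        "numba_version": lambda: sum(i * i for i in range(50_000)),
        "gdal_version": lambda: sum(range(50_000)),
    }
)
for name, seconds in timings.items():
    print(f"{name}: {seconds:.6f} s")
```

For Numba code, the first call includes compilation time, so it makes sense to call the function once before timing it.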
Hi @okolekar , I am working on this issue again, and after some testing I discovered that if NumPy is installed before GDAL, the version of distance_to_anomaly you implemented that uses GDAL seems to work. However, I also discovered that the original distance_to_anomaly_gdal, which Nikolas implemented a long time ago and which calls gdal_proximity from osgeo_utils, seems to run without issues or any additional installation steps on Ubuntu and Windows for me. It is as fast as your implementation, which is still a lot faster than the one that uses Numba right now.
I think we can still include a try-except structure to fall back to the slower version just in case the gdal version does not work for everyone, but otherwise I'll proceed to make all the distance tools use this version to be as fast as possible. I can tag you as a reviewer when I'm done, if you have time to take a look and test on your machine.