georasters
georasters copied to clipboard
Improved performace and correctness for raster union
Moved some stuff around and made some performance improvements to the merge functionality. There were a lot of extra array copies and instances where larger data types than needed were used.
The main improvements were:
- selecting the correct numpy
dtype
for the arrays that were created - selecting a default no data value if there was none
- creating a pre-filled array with the no data value instead of multiplying it with an array of ones
- making the mask start out as an array of bools.
Testing done: Just on my own code, merging 24 rasters into a 43201x86401 array. It just barely works now instead of crashing because it was not able to allocate the arrays.
CI might be broken, it's erroring because it can't find libspatialindex_c, which is something that I noted in #27.
Looks so much nicer. Thanks!
Would be good to test and make sure it still works "as expected". It is on the to-do list in #14. If you feel like contributing some tests that's be really great!
I'll try to see where that libspatialindex_c
is coming from. I do not think the package uses it explicitly. It may be from one of the other GIS
packages, like geopandas
or pysal
.
It looks like rtree
from geopandas
is where it's coming from. I can mock up some tests with some fake data and see if the outputs are the same. I'm pretty sure types are going to be different because they weren't handled before, but I could do something like a mse threshold between outputs to make sure that nothing wonky is happening.
It seems in the CI with py3.5 is failing due to some matplotlib
incompatibility with scikits-image
. On py36
it is rtree
which is needed by geopandas
. Using conda install -c conda-forge georasters
has been working so far without any of these issues, since all the versions of the packages are coherent. I guess pip
is having problems due to the compatibility of packages.