ArchGDAL.jl
ArchGDAL.jl copied to clipboard
filter of an overview gives wrong answer
When I use the filter function on an overview of a band I still get some of the filtered values.
These are not there , when I am first collecting the array and then do the filtering operation. This only happens with an overview that I get from AG.getoverview and not from a band directly. An example file is attached. I find it strange, that there seems to be the same amount of values which for me might mean, that the filtering and the accessing is using different code paths.
I also see strange behaviour when plotting the data. The overview plotted directly seems to be randomly shifted.
julia> rastop = AG.readraster("pyramidmiddle.tif")
GDAL Dataset (Driver: GTiff/GeoTIFF)
File(s):
pyramidmiddle.tif
Dataset (width x height): 938 x 938 (pixels)
Number of raster bands: 1
[GA_ReadOnly] Band 1 (Gray): 938 x 938 (Int16)
julia> b =AG.getband(rastop, 1)
[GA_ReadOnly] Band 1 (Gray): 938 x 938 (Int16)
blocksize: 938×128, nodata: -32768.0, units: 1.0px + 0.0
overviews: (0) 469x469 (1) 235x235
julia> o = AG.getoverview(b,1)
[GA_ReadOnly] Band 1 (Gray): 235 x 235 (Int16)
blocksize: 128×128, nodata: -32768.0, units: 1.0px + 0.0
overviews:
julia> filter(!(==(-9999)), o)
2824-element Vector{Int16}:
-136
-159
-175
-180
-178
-137
-146
⋮
-9999
-9999
-9999
-9999
-9999
-9999
julia> oc = collect(o);
julia> filter(!(==(-9999)), oc)
2824-element Vector{Int16}:
-136
-159
-175
-180
-178
-137
-146
⋮
-167
-161
-156
-160
-151
-157
Looks like a DiskArrays.jl problem to me (otherwise you couldn't use filter at all)
The problem is actually that map doesn't reorder itself so the bitmatrix is wrong:
julia> o[map(!(==(-9999)), o)] # This is what `filter` does underneath
2824-element Vector{Int16}:
-136
-159
-175
-180
-178
-137
-146
⋮
-9999
-9999
-9999
-9999
-9999
-9999
-9999
Where broadcast does work:
julia> o[broadcast(!(==(-9999)), o)]
2824-element Vector{Int16}:
-136
-159
-175
-180
-178
-137
-146
⋮
-164
-167
-161
-156
-160
-151
-157
So this is a DiskArrays.jl issue
Haha this is my fault we are missing collect_similar in the DiskGenerator implementation.
Very good. I am still confused, why this only happens on the overviews and not on the band. Because I just looked at the DiskGenerator that is constructed for both and they look very similar.
julia> go # Generator of the overview
DiskArrays.DiskGenerator{ArchGDAL.IRasterBand{Int16}, ComposedFunction{typeof(!), Base.Fix2{typeof(==), Int64}}}(!Base.Fix2{typeof(==), Int64}(==, -9999), [GA_ReadOnly] Band 1 (Gray): 235 x 235 (Int16)
blocksize: 128×128, nodata: -32768.0, units: 1.0px + 0.0
overviews: )
julia> gb # Generator of the band
DiskArrays.DiskGenerator{ArchGDAL.IRasterBand{Int16}, ComposedFunction{typeof(!), Base.Fix2{typeof(==), Int64}}}(!Base.Fix2{typeof(==), Int64}(==, -9999), [GA_ReadOnly] Band 1 (Gray): 938 x 938 (Int16)
blocksize: 938×128, nodata: -32768.0, units: 1.0px + 0.0
overviews: (0) 469x469 (1) 235x235 )
On the band the block looks like the whole axis (938?), so block iterate is the same as normal iterate
I think map should always collect first so we don't have this problem, and we should add filter for diskarrays that doesn't use map and is just not properly ordered, because we rarely care about order with filter?
Always collecting on map for a DiskArray sounds like very dangerous. I opened https://github.com/meggart/DiskArrays.jl/issues/144 so that we can discuss this further there.