whitebox-tools
whitebox-tools copied to clipboard
Inconsistent Flow Accumulation Output Between Approaches
Hi, thanks for all the great work on WBT!
I'm processing DEM to delineate basins and am encountering unexpected behaviour on medium-large basins, rather, on flow accumulations greater than $2^{15}$ cells, which was my first clue.
When using the workflow: FillDepressions --> D8FlowAccumulation(pntr=False)
I get the expected maximum flow accumulation of the raster.
When using either: FillDepressions --> D8Pointer --> D8FlowAccumulation(pntr=True)
or FlowAccumulationFullWorkflow
the maximum accumulation cell is 32767.
I have no experience with rust, but it looks like the FlowAccumulationFullWorkflow
and D8FlowAccumulation
use different accumulation functions, but they appear set the output data type from the input raster. The D8Pointer
output raster is float32, and I can't tell if/where the accumulation function sets the accumulation variable dtype to int16.
From d8_flow_accum.rs
(~line 470):
let mut output = Raster::initialize_using_file(&output_file, &input);
let out_nodata = -32768f32;
output.configs.nodata = out_nodata;
output.configs.photometric_interp = PhotometricInterpretation::Continuous; // if the input is a pointer, this may not be the case by default.
output.configs.data_type = DataType::F32;
output.reinitialize_values(1.0);
drop(input);
The same initialization occurs in flow_accum_full_workflow.rs
, but the output
variable is reinitialized at line ~654 but this time using &accum_file
:
let mut output = Raster::initialize_using_file(&accum_file, &input);
output.reinitialize_values(1.0);
let mut stack = Vec::with_capacity((rows * columns) as usize);
let mut num_solved_cells = 0;
I'm not sure if this is relevant but it doesn't repeat the explicit dtype setting to f32, i.e.:
output.configs.data_type = DataType::F32;
The limited flow accumulation value has a downstream effect on the JensenSnapPourPoints
function because it searches for stream network cells with an expected accumulation value.
Edit: I'm using wbt version 2.1.4 with python 3.8 on Ubuntu 20.
Can you provide the files your using? I can't reproduce on my side when using D8FlowAccumation with the pointer flag. If I provide a 8-bit or 16-bit D8Pointer, my D8 flow accumulation output is always in F32 as it should (that's what the line 476 is doing, it overrides the datatype originally taken from the input file).
Are you sure that your WBT version is 2.1.4 ? The latest release is 2.0.0 (2.1.0 when compiling master 46229fd) and what you are describing was once a bug but Lindsay fixed it in August last year with d5aa7fc.
Thanks for your response.
An example DEM is shared here.
I created a fresh virtual environment (python 3.8) and installed whitebox with pip install whitebox
. After installing, pip show whitebox
shows version 2.1.4 (pypi).
I tested the first case (fill -- flow direction --> flow accumulation) and was not able to replicate the described issue (max flow accumulation of 8204089 cells). However the second case (full_workflow) yielded a maximum flow accumulation value of 32767. Test script below:
import os
import rioxarray as rxr
import numpy as np
from whitebox import WhiteboxTools
wbt = WhiteboxTools()
cwd = os.getcwd()
test_file = '15453500_DEM.tif'
wbt.flow_accumulation_full_workflow(
os.path.join(cwd,test_file),
os.path.join(cwd,'filled.tif'),
os.path.join(cwd,'fdir.tif'),
os.path.join(cwd,'acc.tif'),
out_type="cells",
log=False,
clip=False,
esri_pntr=False,
)
acc = rxr.open_rasterio(os.path.join(cwd,'acc.tif'), masked=True)
max_acc = np.nanmax(acc)
print(max_acc)
Ah ok I see! 2.1.4
is the version of the python frontend which doesn't follow exactly the version number of the backend. To get the version of the binaries you can do:
import whitebox
wbt = whitebox.WhiteboxTools()
print(wbt.version())
I suspect that you was running the newest frontend with an older binaries and by creating a fresh virtual environment, the newer binaries (2.1.0) were downloaded when running wbt = whitebox.WhiteboxTools()
for the first time.
I'll run your DEM to see if I get the same results as you but I can already gives two quick notes your input file:
- it's a lat/lon CRS meaning you might get weird results, but you set the output type to cells so I suppose you know the limits of using a geographic CRS in this case.
- you are using GeoTIFF files which is a format with some issues currently in WBT (see #126). I suggest you convert it first with ArcMap/GDAL to BIL or FLT format to work within WBT.
Hi, thanks so much for the feedback. As you recommended, I checked the binary version and it's 2.1.0. I am familiar with projections but didn't bother for the example, figuring it wouldn't be an issue just to calculate accumulation with cell format output. I haven't been in the habit of reprojecting until I do some process that involves a distance metric, but maybe I should look closer. I will definitely look into the issue you pointed out regarding the GeoTiff format.
So I confirm, you already found the issue inside flow_accum_full_workflow.rs
, it misses output.configs.data_type = DataType::F32;
after line 650 in order to get the same behaviour as with D8FlowAccumulation
. Your input DEM being a 16-bit integer, the accumulation file currently keeps the same datatype.
As for the projection thing, it's because in your case, a cell at the bottom (south) represents approximatively an area of 4500 m² and a cell at the top (north) an area of only 3000 m² (calculations made with EPSG:3578 - NAD83/Yukon Albers). It's quite a gap between the two so you will get distorted accumulation values as having 2 cells on the bottom equals in reality to 3 cells at the top. That behing say, it may not be critical for your application.
Resoved by #270