hydromt
Improvement of catchment delineation speed
Kind of request
Changing existing functionality
Enhancement Description
The current catchment delineation speed is slow and we would like to improve it.
Use case
No response
Additional Context
No response
@DirkEilander do you have time to refine this issue?
Given an outlet location, we currently delineate the basin at the grid-pixel level for the entire upstream area, which gives the most accurate result. Delineating at the subbasin level instead gives an approximate result much faster. If we combine both, first getting the approximate basin from subbasins and then delineating at the pixel level only within the subbasin that contains the outlet, we get an approach that is both fast and accurate. This requires a dataset of predefined subbasin polygons, with the upstream neighboring subbasins listed in the metadata of each subbasin.
Reference and link to useful datasets: https://github.com/mheberger/delineator
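A minimal sketch of the combined approach described above, using toy data instead of real rasters and polygons. All names here (`upstream_of`, `cell_subbasin`, `next_cell`, `delineate_hybrid`) are hypothetical and not part of hydromt or pyflwdir. It also assumes the outlet lies on the main stem of its subbasin, so that every upstream subbasin drains through it; a real implementation would have to verify that.

```python
from collections import deque


def upstream_ids(start, upstream_of):
    """Return all subbasins strictly upstream of `start` (breadth-first search)."""
    seen, queue = set(), deque(upstream_of.get(start, []))
    while queue:
        sid = queue.popleft()
        if sid not in seen:
            seen.add(sid)
            queue.extend(upstream_of.get(sid, []))
    return seen


def upstream_cells(outlet_cell, next_cell):
    """Return all cells draining to `outlet_cell`, including the outlet itself.

    `next_cell` maps each cell to its downstream cell (None at a pit or edge).
    """
    inflow = {}
    for cell, down in next_cell.items():
        inflow.setdefault(down, []).append(cell)
    seen, stack = {outlet_cell}, [outlet_cell]
    while stack:
        for up in inflow.get(stack.pop(), []):
            if up not in seen:
                seen.add(up)
                stack.append(up)
    return seen


def delineate_hybrid(outlet_cell, cell_subbasin, next_cell, upstream_of):
    """Pixel-level delineation in the outlet subbasin, vector-level upstream of it."""
    outlet_sub = cell_subbasin[outlet_cell]
    # Only the flow directions inside the outlet subbasin are needed at pixel level.
    local = {c: d for c, d in next_cell.items() if cell_subbasin[c] == outlet_sub}
    cells = upstream_cells(outlet_cell, local)
    # Assumption: all upstream subbasins drain through the outlet cell.
    whole_subbasins = upstream_ids(outlet_sub, upstream_of)
    return cells, whole_subbasins


# Toy example: cells 0-3 lie in subbasin 2, cell 4 in subbasin 1;
# subbasin 3 drains into 1, which drains into 2.
cell_subbasin = {0: 2, 1: 2, 2: 2, 3: 2, 4: 1}
next_cell = {0: None, 1: 0, 2: 1, 3: 1, 4: 3}
upstream_of = {2: [1], 1: [3]}
cells, subbasins = delineate_hybrid(1, cell_subbasin, next_cell, upstream_of)
```

The final basin would then be the union of the pixel-level cells in the outlet subbasin and the full polygons of the selected upstream subbasins.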
Setup_region
Implement three methods for basin/subbasin delineation in setup_region:
- basin/subbasin: fully pixel based (local data) (can still be supported with full basin vector to clip vector) --> current one
- basin/subbasin: fully vector based (subbasin vector or full basin vector)
- subbasin: mix pixel based (most downstream subbasin) and vector based (subbasin vector)
During the implementation, partial reading of the data on the fly would be nice: e.g. first read only the attributes to determine which subbasin geometries actually need to be loaded in the end.
For the mixed method or the pixel-based method (with bbox / index database), how do we deal with a potential mismatch between the source of the DEM/flwdir and the source of the basin/subbasin shapes, e.g. MERIT Hydro DEM and HydroSHEDS basins? Can we do a check, e.g. during pixel-level delineation, to find potentially missing upstream cells caused by an inaccurate shapefile? Should we (always) apply a buffer to the geometry before moving to pixel-based delineation?
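One possible form of such a check, sketched on toy data: after rasterizing a full-basin polygon to a set of cells, look for cells outside that set whose flow direction points into it. For a correct basin outline there should be none, so any hit hints at upstream area cut off by the shapefile. The function name and the cell-to-downstream-cell dictionary are hypothetical, and this is only one way the question above could be answered.

```python
def cells_draining_into_mask(next_cell, mask):
    """Return cells outside `mask` whose direct downstream cell lies inside `mask`.

    `next_cell` maps each cell to its downstream cell (None at a pit or edge);
    `mask` is the set of cells covered by the rasterized basin polygon.
    """
    return sorted(c for c, d in next_cell.items() if c not in mask and d in mask)


# Toy example: cell 4 drains into cell 3, but the polygon only covers cells 0-3.
next_cell = {0: None, 1: 0, 2: 1, 3: 1, 4: 3}
missing = cells_draining_into_mask(next_cell, mask={0, 1, 2, 3})
```

If the returned list is non-empty, the geometry could be buffered and the check repeated before the pixel-level delineation runs.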
Data requirements
GeoDataFrame of basins/subbasins. For subbasins, it should contain data on the upstream basins.
Data to use:
- HydroSHEDS --> has info on downstream basin and main basin but not upstream
- MERIT Basins --> needs transformation because the info on upstream and downstream basins is in the river file and not in the subbasin file
Ideally we would create a specific data catalog with data directly available from cloud/zenodo etc. so that external users can directly delineate a basin/subbasin without having to download data first --> separate issue
Implications
Update the setup_basemaps method --> separate issue. setup_basemaps (rename to setup_hydrography?):
- delineate subbasin/basin with vector and buffer
- load the flwdir and upscale if wanted
- derive exact delineation from the upscaled flwdir
@DirkEilander I checked HydroSHEDS: info on upstream basins is not available, but the downstream one is (it is also in the rivers layers of MERIT Basins). So I suggest a method based on the downstream ID; later we could add one based on the upstream ID if necessary. HydroSHEDS is also easy to download via a link such as https://data.hydrosheds.org/file/HydroBASINS/standard/hybas_eu_lev12_v1c.zip so I think the hydromt data catalog with cached download should be able to support it (we can create placeholders for region and level). I will start with this database and a search based on the downstream ID.
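A rough sketch of what a search based on the downstream ID could look like, using plain attribute records rather than a GeoDataFrame. `HYBAS_ID` and `NEXT_DOWN` are the HydroBASINS attribute names for a subbasin's ID and its downstream neighbor (0 at a basin outlet); the function name and the toy records are made up for illustration.

```python
from collections import defaultdict, deque


def upstream_from_downstream(records, outlet_id, id_key="HYBAS_ID", down_key="NEXT_DOWN"):
    """Return `outlet_id` plus the IDs of all subbasins that drain into it.

    Only the downstream ID of each record is used: the links are inverted once
    into a downstream -> [upstream] lookup, then followed breadth-first.
    """
    children = defaultdict(list)
    for rec in records:
        children[rec[down_key]].append(rec[id_key])
    selected, seen, queue = [outlet_id], {outlet_id}, deque([outlet_id])
    while queue:
        for up in children.get(queue.popleft(), []):
            if up not in seen:
                seen.add(up)
                selected.append(up)
                queue.append(up)
    return selected


# Toy attribute table: two separate basins (outlets 10 and 20).
records = [
    {"HYBAS_ID": 10, "NEXT_DOWN": 0},
    {"HYBAS_ID": 11, "NEXT_DOWN": 10},
    {"HYBAS_ID": 12, "NEXT_DOWN": 10},
    {"HYBAS_ID": 13, "NEXT_DOWN": 11},
    {"HYBAS_ID": 20, "NEXT_DOWN": 0},
]
ids = upstream_from_downstream(records, outlet_id=11)
```

Because only attributes are needed for the search, the geometries of the selected IDs could be loaded afterwards in a second step.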
Thanks for looking into this! The awesome pyflwdir package 😜 can also work with vector-based flow directions using the from_dataframe method. We would still need to add a simple method to get all upstream nodes, similar to the raster-based basin method, though. I've created an issue at pyflwdir: https://github.com/Deltares/pyflwdir/issues/40
Good news, @DirkEilander! Is it then better to block this issue until the next release of pyflwdir?
@DirkEilander and @hboisgon , I understood from our planning toolbox CST that there is an urgency for improving catchment delineation speed. I have two questions:
- Since this issue is blocked until the next release of pyflwdir, when do you expect that release?
- Can you please provide an estimate of the amount of work required (i.e., number of days)?
Thanks in advance!
We discussed this in the sprint planning and decided to postpone this at least until after our v1.alpha release