[Bug] r.fill.dir Segfault/stack overflow with larger datasets
Describe the bug
Large datasets seem to cause a segfault. The backtrace from the resulting core dump is more than 100k frames deep, almost all of them recursive calls to recurse_list:
(gdb) bt
#0 0x0000555c25ff41a6 in recurse_list (flag=flag@entry=117042, cells=cells@entry=0x7fa04fa00010, sz=sz@entry=4252788, start=<optimized out>) at ./raster/r.fill.dir/dopolys.c:26
#1 0x0000555c25ff41b5 in recurse_list (flag=flag@entry=117042, cells=cells@entry=0x7fa04fa00010, sz=sz@entry=4252788, start=<optimized out>) at ./raster/r.fill.dir/dopolys.c:26
#2 0x0000555c25ff41b5 in recurse_list (flag=flag@entry=117042, cells=cells@entry=0x7fa04fa00010, sz=sz@entry=4252788, start=<optimized out>) at ./raster/r.fill.dir/dopolys.c:26
...snip...
#104756 0x0000555c25ff41b5 in recurse_list (flag=flag@entry=117042, cells=cells@entry=0x7fa04fa00010, sz=sz@entry=4252788, start=<optimized out>) at ./raster/r.fill.dir/dopolys.c:26
#104757 0x0000555c25ff41b5 in recurse_list (flag=flag@entry=117042, cells=cells@entry=0x7fa04fa00010, sz=sz@entry=4252788, start=<optimized out>) at ./raster/r.fill.dir/dopolys.c:26
#104758 0x0000555c25ff41b5 in recurse_list (flag=flag@entry=117042, cells=cells@entry=0x7fa04fa00010, sz=sz@entry=4252788, start=<optimized out>) at ./raster/r.fill.dir/dopolys.c:26
#104759 0x0000555c25ff43d9 in dopolys (fd=fd@entry=6, fm=fm@entry=7, nl=nl@entry=11011, ns=ns@entry=14056) at ./raster/r.fill.dir/dopolys.c:81
#104760 0x0000555c25ff399c in main (argc=<optimized out>, argv=<optimized out>) at ./raster/r.fill.dir/main.c:210
It seems that https://trac.osgeo.org/grass/ticket/2742 was never actually fixed.
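For illustration, here is a minimal sketch of the usual way to avoid this kind of crash: replacing recursion with an explicit, heap-allocated worklist. This is not the code in dopolys.c. The function `flood_label`, its parameters, and the 4-connected labeling task are all hypothetical, chosen only to show the pattern.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper, NOT the r.fill.dir implementation.
 * Relabels every cell that is 4-connected to (row, col) and holds the
 * value `from` so that it holds `to`. The pending cells live in a
 * heap-allocated worklist, so the traversal depth is bounded by the
 * available heap instead of the thread stack limit.
 * Returns the number of relabeled cells, or -1 if allocation fails
 * (the grid may then be partially relabeled). */
long flood_label(int *grid, int rows, int cols, int row, int col,
                 int from, int to)
{
    assert(grid != NULL && rows > 0 && cols > 0);

    if (from == to || row < 0 || row >= rows || col < 0 || col >= cols)
        return 0;
    if (grid[(size_t)row * cols + col] != from)
        return 0;

    size_t cap = 1024, top = 0;
    size_t *stack = malloc(cap * sizeof *stack);
    if (stack == NULL)
        return -1;

    const int dr[4] = {-1, 1, 0, 0};
    const int dc[4] = {0, 0, -1, 1};
    long count = 0;

    /* Mark a cell when it is pushed, so each cell is pushed at most once. */
    size_t start = (size_t)row * cols + col;
    grid[start] = to;
    stack[top++] = start;

    while (top > 0) {
        size_t idx = stack[--top];
        int r = (int)(idx / (size_t)cols);
        int c = (int)(idx % (size_t)cols);
        count++;

        for (int k = 0; k < 4; k++) {
            int nr = r + dr[k], nc = c + dc[k];
            if (nr < 0 || nr >= rows || nc < 0 || nc >= cols)
                continue;
            size_t nidx = (size_t)nr * cols + nc;
            if (grid[nidx] != from)
                continue;
            if (top == cap) {
                /* Grow the worklist on the heap instead of the call stack. */
                size_t ncap = cap * 2;
                size_t *tmp = realloc(stack, ncap * sizeof *tmp);
                if (tmp == NULL) {
                    free(stack);
                    return -1;
                }
                stack = tmp;
                cap = ncap;
            }
            grid[nidx] = to;
            stack[top++] = nidx;
        }
    }

    free(stack);
    return count;
}
```

With this shape, a connected run of several hundred thousand cells only grows the heap buffer, whereas a recursive version would need one stack frame per cell.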
To reproduce
- Download a large chunk of DEM data. I used https://maps.vcgi.vermont.gov/opendata/clipandzip.html?InputLayerName=IMG_VCGI_LIDARDEM_SP_v3&InputFtype=raster with a chunk about 3/4 the size of the towns in the bottom right. Mine was about 600MB.
- Extract and import
- Run r.fill.dir with it
- See error
Memory should not have been the limit: I had about 16GB free, and it went unused while this was running.
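One possible explanation, consistent with the backtrace but not confirmed here, is that the limit being hit is the process stack limit (commonly 8 MiB by default on Linux) rather than total RAM. A small sketch for checking that limit with POSIX `getrlimit` follows. The helper names and the per-frame size passed to the estimate are assumptions for illustration, not measured values from r.fill.dir.

```c
#include <sys/resource.h>

/* Returns the soft stack limit of the current process in bytes,
 * 0 if the limit is unlimited, or -1 if getrlimit fails. */
long long stack_soft_limit_bytes(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_STACK, &rl) != 0)
        return -1;
    if (rl.rlim_cur == RLIM_INFINITY)
        return 0;
    return (long long)rl.rlim_cur;
}

/* Rough upper bound on recursion depth for a given stack size and an
 * ASSUMED per-call frame size; returns -1 for non-positive inputs. */
long long max_depth_estimate(long long stack_bytes, long long frame_bytes)
{
    if (stack_bytes <= 0 || frame_bytes <= 0)
        return -1;
    return stack_bytes / frame_bytes;
}
```

For example, `max_depth_estimate(8LL * 1024 * 1024, 64)` gives 131072, which is the same order of magnitude as the roughly 104,760 frames in the backtrace. The 64-byte frame size is a guess; the real value depends on the compiler and build flags.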
Expected behavior
No Crash
Screenshots
Importing raster map <rast_667ee69b044a211>...
0..3..6..9..12..15..18..21..24..27..30..33..36..39..42..45..48..51..54..57..60..63..66..69..72..75..78..81..84..87..90..93..96..99..100
Reading input elevation raster map...
0..3..6..9..12..15..18..21..24..27..30..33..36..39..42..45..48..51..54..57..60..63..66..69..72..75..78..81..84..87..90..93..96..99..100
Filling sinks...
Determining flow directions for ambiguous cases...
Segmentation fault (core dumped)
System description
- Operating System: Debian GNU/Linux 12
- GRASS GIS version: 8.4.1 (Debian)
| QGIS version | 3.22.16-Białowieża - QGIS code branch - Release 3.22 |
|---|---|
| Qt version | 5.15.8 |
| Python version | 3.11.1 |
| GDAL/OGR version | 3.6.2 |
| PROJ version | 9.1.1 |
| EPSG Registry database version | v10.076 (2022-08-31) |
| GEOS version | 3.11.1-CAPI-1.17.1 |
| SQLite version | 3.40.1 |
| PostgreSQL client version | 15.1 (Debian 15.1-1+b1) |
| SpatiaLite version | 5.0.1 |
| QWT version | 6.1.4 |
| QScintilla2 version | 2.13.3 |
| OS version | Debian GNU/Linux 12 (bookworm) |
| Active Python plugins | |
| quick_map_services | 0.19.29 |
| DEMto3D | 3.6 |
| MetaSearch | 0.3.5 |
| db_manager | 0.1.20 |
| grassprovider | 2.12.99 |
| processing | 2.12.99 |
| sagaprovider | 2.12.99 |
As there is no link from the Trac ticket to an actual commit, I'd guess it never got fixed. Please provide your computational region parameters (g.region -p) so we can determine how large "large" is.
Hmm, I am unsure where or how to use that command, but QGIS's layer properties says:
- Total size: 528.95 MB
- Width: 14056
- Height: 11011
- Data type: Float32 - Thirty two bit floating point
- GDAL Driver Description: HFA
- GDAL Driver Metadata: Erdas Imagine Images (.img)
- Dimensions: X: 14056 Y: 11011 Bands: 1
- Pixel Size: 0.6999999999484479707,-0.6999999999484491919
- Name: NAD83 / Vermont
- Units: meters
- Method: Transverse Mercator
I can confirm. GRASS 8.4.0, Debian Linux/12, AMD64. GTiff/GeoTIFF, 1467x2658, 1 band, Float32. Runs for about 2 days, then segfaults.
Is it possible to have a smaller reproduction, one that doesn't take 2 days to trigger? Does constraining the machine resources in a VM help trigger it?
I know that with my source (the Vermont GIS export link I described above) it was fairly quick (under 5 minutes, I think).