duckdb_spatial
OOM aggregate / `group by` query on Geometry type
Problem
Good morning! I want this query to run and perform an aggregate over the GEOMETRY-type column. Any help would be greatly appreciated.
Traceback
terminate called recursively
terminate called after throwing an instance of 'duckdb::OutOfMemoryException'
Out of Memory Error: could not allocate block of size 256.0 KiB (49.9 GiB/49.9 GiB used)
Alternatively, I will get:
zsh killed
Steps to re-create
Running this query against a table stored in a `.duckdb` file, locally:
```sql
SELECT
    row_id,
    ST_MakeValid(ST_Union_agg(geom_column)) AS geometry_column,
    array_agg(property_id) AS property_ids
FROM my_geometry_table
GROUP BY row_id;
```
- `my_geometry_table` is 35 GiB in the `.duckdb` file and comprises 153,180,272 rows.
- Hits OOM within a few seconds. I've set a `LIMIT 10` clause just to see, and it hits OOM in the same timeframe.
- I've set the DuckDB `memory_limit` to varying values from `512MB` up to `60GB` (the maximum memory on the machine), as sketched below.
- I've set `threads=1` as well.
- When I remove the `ST_MakeValid(ST_Union_agg())`, it runs.
- I then attempted this query in two parts, where I first just call `array_agg()` on the `GEOMETRY` column and, in a subsequent query, perform the `ST_Union_agg()`. Even the `array_agg()` gets OOM killed in moments.
- The memory/thread settings seem to have no effect.
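For completeness, this is roughly how I applied those settings (a sketch only; the actual values varied across the ranges described above):

```sql
-- Settings I varied (values here are illustrative)
SET memory_limit = '60GB';
SET threads = 1;

-- Still OOMs within seconds, even with a LIMIT applied
SELECT
    row_id,
    ST_MakeValid(ST_Union_agg(geom_column)) AS geometry_column,
    array_agg(property_id) AS property_ids
FROM my_geometry_table
GROUP BY row_id
LIMIT 10;
```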
I can't share the data underneath the query, unfortunately, but I'm wondering if the information provided suffices to diagnose a memory leak or an optimization I'm missing.
Environment:
% uname -a
Linux john-XPS-15-9520 6.8.0-40-generic #40~22.04.3-Ubuntu SMP PREEMPT_DYNAMIC Tue Jul 30 17:30:19 UTC 2 x86_64 x86_64 x86_64 GNU/Linux
% duckdb --version
v1.1.2 f680b7d08f
My Machine:
- 64GiB Memory
- 20 CPU
- Dell XPS
Hi! Thanks for opening this issue!
There's a couple of things at play here. Neither ST_Union_Agg nor array_agg is able to spill intermediate state to disk, as they are "variable size" aggregates - they basically keep pointers into an unpredictable amount of memory, so when they are offloaded to disk, only the pointer gets serialized. This is currently a fundamental limitation of how DuckDB works (actually, I'm not aware of any system that can spill variable-size aggregates to disk). However, in the case of array_agg, DuckDB will at least track the amount of memory used, which is why you sometimes get the OOM exception instead of the process just getting killed. For ST_Union_Agg, the actual union operation is handled by the GEOS library, which allocates memory outside of DuckDB's control, so DuckDB can't track the memory used by the final result geometry until the aggregate actually finishes.
Depending on how many groups row_id creates, this means that you will have to keep all the unioned geometries in memory at the same time until the full operation finishes, so it's unlikely that adjusting the number of threads DuckDB uses will make any difference. We do have some work in progress in DuckDB core that can make use of partitioning (and eventually sorting) information to avoid keeping aggregate states around once the system knows no more entries will be part of a group, but that's probably a long way out.
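If it's useful, a plain cardinality check like the sketch below (using the table and column names from your query; nothing spatial-specific) would give a rough idea of how many aggregate states the GROUP BY has to keep alive at once:

```sql
-- How many groups (and therefore simultaneous in-memory union states) the GROUP BY creates
SELECT count(DISTINCT row_id) AS group_count,
       count(*)               AS row_count
FROM my_geometry_table;
```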
In general, the spatial union operation is one of the most memory-intensive things you can do in geospatial processing. I haven't looked at our implementation in a while, so there might be some optimizations I missed the first time around, but as long as we rely on GEOS I think there will always be some amount of uncontrollable memory usage. Like most things related to spatial, I'd like to one day provide our own implementation, but building an overlay engine in particular would require a ton of work and testing to catch all the edge cases. Even then, I doubt st_union_agg would ever be immune to memory constraints.
So to summarize: I'll try to find some time to revisit this part of the code and see if it can be optimized, but it's probably always going to cause a lot of memory pressure, and I don't think there are any workarounds currently.
Thank you for such a prompt and thoughtful reply.
I'm currently using DBT to orchestrate model creation, so I'm going to have to employ some sort of dynamic batching.
With no memory limit set and the underlying table this CTAS pulls from reduced to only 10,000,000 records, the build completed in 15 seconds with no OOM kill. Great.
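For anyone who lands here later, the kind of batching I have in mind looks roughly like the sketch below. It's only a sketch: the modulo-based partitioning on `row_id`, the batch count of 16, and the output table name are placeholders I haven't settled on, and it assumes `row_id` is an integer.

```sql
-- Hypothetical batching: materialise one slice of groups at a time, so only
-- that slice's aggregate states are held in memory. Batch count (16) is a placeholder.
CREATE TABLE my_geometry_unioned AS
SELECT row_id,
       ST_MakeValid(ST_Union_agg(geom_column)) AS geometry_column,
       array_agg(property_id)                  AS property_ids
FROM my_geometry_table
WHERE row_id % 16 = 0              -- batch 0 of 16
GROUP BY row_id;

-- Remaining batches would be appended one at a time, e.g.:
-- INSERT INTO my_geometry_unioned
-- SELECT row_id, ST_MakeValid(ST_Union_agg(geom_column)), array_agg(property_id)
-- FROM my_geometry_table WHERE row_id % 16 = 1 GROUP BY row_id;
```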
Thank you for clearing up my path of approach!
Feel free to mark this as closed if you'd like, as I believe it's a design issue on my end at this point.