duckdb_spatial icon indicating copy to clipboard operation
duckdb_spatial copied to clipboard

Spatial predicates impacts other filter values

Open gsueur opened this issue 8 months ago • 3 comments

Hi Here is a pretty straightforward SQL query :

SELECT  d.fidji_valeur_fonciere__integer
FROM  dvf.disposition d 
WHERE d.fidji_valeur_fonciere__integer < 50000  
ORDER BY d.fidji_valeur_fonciere__integer DESC ;

with a pretty straightforward result starting with values at 49,999

If I add a spatial filter (either Intersects, Overlaps or contains) I get different results :

SELECT  d.fidji_valeur_fonciere__integer
FROM  dvf.disposition d 
WHERE d.fidji_valeur_fonciere__integer < 50000  
AND  ST_Intersects(ST_MakeEnvelope(1.526264074441798, 45.15464025455313, 1.5394900973676613, 45.16297435294794), d.geom)
ORDER BY d.fidji_valeur_fonciere__integer DESC ;

now the results start with a 65500 value, not respecting the d.fidji_valeur_fonciere__integer < 50000 clause. If I remove the d.fidji_valeur_fonciere__integer < 50000 clause, I have even bigger results as expected. But it's like using the spatial function thwarts the other filter from being executed totally....

gsueur avatar Mar 17 '25 16:03 gsueur

Hello! I am unable to reproduce this behavior. Here's what I've tried:

-- Create a table with 100 diagonal points
CREATE TABLE t1 as SELECT ST_Point(x, x) as g, x as i FROM range(0,100) r(x);

-- Filter
 SELECT * FROM t1 WHERE i > 5 AND st_intersects(g, ST_MakeEnvelope(0, 0, 10, 10)) ORDER BY i DESC;
┌───────────────┬───────┐
│       g       │   i   │
│   geometry    │ int64 │
├───────────────┼───────┤
│ POINT (10 10) │    10 │
│ POINT (9 9)   │     9 │
│ POINT (8 8)   │     8 │
│ POINT (7 7)   │     7 │
│ POINT (6 6)   │     6 │
└───────────────┴───────┘

This all seems correct to me.

  • What version of DuckDB are you using.
  • Can you share the dataset you're working with?
  • What is the query plan for the "wrong" query? I.E. what is the output if you run EXPLAIN <your query>?

Maxxen avatar Apr 02 '25 08:04 Maxxen

Hi Max,

Thanks for taking take of my issue. I've been using last version, currently v1.2.1. I can share the dataset but it's big. I'm currently uploading a parquet file of the table on r2 may it may take some time as I am out of the office with a very bad internet connection. Here is the EXPLAIN statement output :

Physical Plan ┌───────────────────────────┐ │ PROJECTION │ │ ──────────────────── │ │__internal_decompress_integ│ │ ral_bigint(#0, 0) │ │ │ │ ~592481 Rows │ └─────────────┬─────────────┘ ┌─────────────┴─────────────┐ │ ORDER_BY │ │ ──────────────────── │ │d.fidji_valeur_fonciere__in│ │ teger DESC │ └─────────────┬─────────────┘ ┌─────────────┴─────────────┐ │ PROJECTION │ │ ──────────────────── │ │__internal_compress_integra│ │ l_usmallint(#0, 0) │ │ │ │ ~592481 Rows │ └─────────────┬─────────────┘ ┌─────────────┴─────────────┐ │ PROJECTION │ │ ──────────────────── │ │ #0 │ │ │ │ ~592481 Rows │ └─────────────┬─────────────┘ ┌─────────────┴─────────────┐ │ FILTER │ │ ──────────────────── │ │ ST_Intersects('\x02\x04 │ │ \x00\x00\x00\x00\x00\x00 │ │ \x9F\x5C\xC3?Z\x9E4B\x03 │ │ \x0E\xC5?\xE3\xA64B\x02 │ │ \x00\x00\x00\x01\x00\x00 │ │ \x00\x05\x00\x00\x00\x00 │ │ \x00\x00\x00\x80\x98\xCC │ │ \xE0\x93k\xF8?\xB4\xF9y@ │ │ \xCB\x93F@\x80\x98\xCC\xE0│ │ \x93k\xF8?\xC4\xFF\xF5W │ │ \xDC\x94F@\x00\x5CK^\xC0 │ │ \xA1\xF8?\xC4\xFF\xF5W\xDC│ │ \x94F@\x00\x5CK^\xC0\xA1 │ │ \xF8?\xB4\xF9y@\xCB\x93F@ │ │ \x80\x98\xCC\xE0\x93k\xF8?│ │ \xB4\xF9y@\xCB\x93F@': │ │ :GEOMETRY, geom) │ │ │ │ ~592481 Rows │ └─────────────┬─────────────┘ ┌─────────────┴─────────────┐ │ FILTER │ │ ──────────────────── │ │ (#0 < 50000) │ │ │ │ ~14812040 Rows │ └─────────────┬─────────────┘ ┌─────────────┴─────────────┐ │ RTREE_INDEX_SCAN │ │ ──────────────────── │ │ Table: disposition │ │ │ │ Index: │ │ idx_disposition_geom │ │ │ │ Projections: │ │fidji_valeur_fonciere__inte│ │ ger │ │ geom │ │ │ │ ~14812040 Rows │ └───────────────────────────┘

Thanks Guillaume

gsueur avatar Apr 02 '25 17:04 gsueur

As I just ran the queries again to provide the output I don't see the problem happening again. don't bother do anything right now I'll check into it deeper later this week

Cheers

gsueur avatar Apr 02 '25 17:04 gsueur