duckdb_spatial icon indicating copy to clipboard operation
duckdb_spatial copied to clipboard

Spatial join implementation missing for `ST_DWITHIN_SPHEROID` predicate

Open Doekeb opened this issue 4 months ago • 1 comments

Efficient spatial joins are a game-changer! Looks like they are not yet implemented for st_dwithin_spheroid, and this implementation would be super useful. A toy example query might be something like: "find all users, with multiplicity, within a certain radius of all stores."

To demonstrate, here's a simple example:

  • Create two tables of random points that could be interpreted as on the Earth
install spatial;
load spatial;
create or replace table foo as (select st_point(180 * random() - 90, 360 * random() - 180) as foo_point from range(10000));
create or replace table bar as (select st_point(180 * random() - 90, 360 * random() - 180) as bar_point from range(10000));
  • Query 1: "Within" join on flat geometry
select * from foo inner join bar on st_dwithin(foo.foo_point, bar.bar_point, 50);
  • explain shows the join type as SPATIAL_JOIN. explain analyze shows fast performance:
┌────────────────────────────────────────────────┐
│┌──────────────────────────────────────────────┐│
││              Total Time: 0.519s              ││
│└──────────────────────────────────────────────┘│
└────────────────────────────────────────────────┘
┌───────────────────────────┐
│           QUERY           │
└─────────────┬─────────────┘
┌─────────────┴─────────────┐
│      EXPLAIN_ANALYZE      │
│    ────────────────────   │
│           0 Rows          │
│          (0.00s)          │
└─────────────┬─────────────┘
┌─────────────┴─────────────┐
│         EXTENSION         │
│    ────────────────────   │
│      Join Type: INNER     │
│                           │
│        Conditions:        │
│   ST_DWithin(foo_point,   ├──────────────┐
│         bar_point)        │              │
│                           │              │
│       10100910 Rows       │              │
│          (0.51s)          │              │
└─────────────┬─────────────┘              │
┌─────────────┴─────────────┐┌─────────────┴─────────────┐
│         TABLE_SCAN        ││         TABLE_SCAN        │
│    ────────────────────   ││    ────────────────────   │
│         Table: foo        ││         Table: bar        │
│   Type: Sequential Scan   ││   Type: Sequential Scan   │
│                           ││                           │
│        Projections:       ││        Projections:       │
│         foo_point         ││         bar_point         │
│                           ││                           │
│         10000 Rows        ││         10000 Rows        │
│          (0.00s)          ││          (0.00s)          │
└───────────────────────────┘└───────────────────────────┘
  • Query 2: "Within" join on spherical geometry (radius chosen to achieve a similar-size result set as the previous query)
select * from foo inner join bar on st_dwithin_spheroid(foo.foo_point, bar.bar_point, 5000000);
  • explain shows the join type as BLOCKWISE_NL_JOIN. explain analyze shows dramatically slower performance despite the similar-sized result set:
┌────────────────────────────────────────────────┐
│┌──────────────────────────────────────────────┐│
││              Total Time: 85.51s              ││
│└──────────────────────────────────────────────┘│
└────────────────────────────────────────────────┘
┌───────────────────────────┐
│           QUERY           │
└─────────────┬─────────────┘
┌─────────────┴─────────────┐
│      EXPLAIN_ANALYZE      │
│    ────────────────────   │
│           0 Rows          │
│          (0.00s)          │
└─────────────┬─────────────┘
┌─────────────┴─────────────┐
│     BLOCKWISE_NL_JOIN     │
│    ────────────────────   │
│      Join Type: INNER     │
│                           │
│         Condition:        │
│  ST_DWithin_Spheroid(CAST │
│  (foo_point AS POINT_2D), ├──────────────┐
│      CAST(bar_point AS    │              │
│    POINT_2D), 5000000.0)  │              │
│                           │              │
│       17163875 Rows       │              │
│          (85.49s)         │              │
└─────────────┬─────────────┘              │
┌─────────────┴─────────────┐┌─────────────┴─────────────┐
│         TABLE_SCAN        ││         TABLE_SCAN        │
│    ────────────────────   ││    ────────────────────   │
│         Table: foo        ││         Table: bar        │
│   Type: Sequential Scan   ││   Type: Sequential Scan   │
│                           ││                           │
│        Projections:       ││        Projections:       │
│         foo_point         ││         bar_point         │
│                           ││                           │
│         10000 Rows        ││         10000 Rows        │
│          (0.00s)          ││          (0.00s)          │
└───────────────────────────┘└───────────────────────────┘

Doekeb avatar Sep 09 '25 17:09 Doekeb

Hello! Thanks for opening this issue. Yes, this is currently expected, spheroid/sphere based joins require a more advanced indexing structure to handle wraparound the earth. But it is definitely something we would like to support eventually.

Maxxen avatar Oct 24 '25 10:10 Maxxen