duckdb_spatial

INTERNAL Error: Attempted to access index n within vector of size n

Open orennia-scott-wang opened this issue 9 months ago • 2 comments

What happens?

Hello,

We encountered this issue in DuckDB 1.11 and believed it was fixed in v1.2.0. However, we can still reproduce the same problem.

Environment info:

DuckDB 1.2.0 with the spatial extension installed:

[
  {
    "extension_name": "spatial",
    "loaded": true,
    "installed": true,
    "install_path": "/root/.duckdb/extensions/v1.2.0/linux_arm64/spatial.duckdb_extension",
    "description": "Geospatial extension that adds support for working with spatial data and functions",
    "aliases": [],
    "extension_version": "79bf2b6",
    "install_mode": "REPOSITORY",
    "installed_from": "core"
  }
]

A similar issue was reported earlier and is now closed: https://github.com/duckdb/duckdb/issues/10950

To Reproduce

To reproduce, unzip the attached Parquet file and run:

INSTALL spatial;
LOAD spatial;
create table map_us_native_american_land as select * from 'map_us_native_american_land.parquet';

The query works without an R-tree index:

SELECT geo_id,name,latitude AS "latitude",longitude AS "longitude",ST_AsGeoJSON(map_layer) as map_layer FROM map_us_native_american_land WHERE ((map_layer_xmax >= -222.44909902948635 AND map_layer_xmin <= -22.126204766600182 AND map_layer_ymax >= -4.221879830812881 AND map_layer_ymin <= 76.82337409623631)) AND ST_CONTAINS( map_layer, ST_GeomFromText( 'POINT(-97.68606911695892 35.9931717004939)' ) ) LIMIT 50;

The same query causes an internal error after the R-tree index is created:

CREATE INDEX my_idx ON map_us_native_american_land USING RTREE (map_layer);


SELECT geo_id,name,latitude AS "latitude",longitude AS "longitude",ST_AsGeoJSON(map_layer) as map_layer FROM map_us_native_american_land WHERE ((map_layer_xmax >= -222.44909902948635 AND map_layer_xmin <= -22.126204766600182 AND map_layer_ymax >= -4.221879830812881 AND map_layer_ymin <= 76.82337409623631)) AND ST_CONTAINS( map_layer, ST_GeomFromText( 'POINT(-97.68606911695892 35.9931717004939)' ) ) LIMIT 50;

INTERNAL Error:
Attempted to access index 2 within vector of size 2

Stack Trace:

0        _ZN6duckdb9ExceptionC2ENS_13ExceptionTypeERKNSt3__112basic_stringIcNS2_11char_traitsIcEENS2_9allocatorIcEEEE + 64
1        _ZN6duckdb17InternalExceptionC1ERKNSt3__112basic_stringIcNS1_11char_traitsIcEENS1_9allocatorIcEEEE + 20
2        _ZN6duckdb17InternalExceptionC1IJyyEEERKNSt3__112basic_stringIcNS2_11char_traitsIcEENS2_9allocatorIcEEEEDpT_ + 140
3        _ZN6duckdb18RowGroupCollection5FetchENS_15TransactionDataERNS_9DataChunkERKNS_6vectorINS_12StorageIndexELb1EEERKNS_6VectorEyRNS_16ColumnFetchStateE + 1152
4        _ZN6duckdb9DataTable5FetchERNS_15DuckTransactionERNS_9DataChunkERKNS_6vectorINS_12StorageIndexELb1EEERKNS_6VectorEyRNS_16ColumnFetchStateE + 136
5        duckdb::PhysicalTableScan::GetData(duckdb::ExecutionContext&, duckdb::DataChunk&, duckdb::OperatorSourceInput&) const + 80
6        duckdb::PipelineExecutor::FetchFromSource(duckdb::DataChunk&) + 124
7        duckdb::PipelineExecutor::Execute(unsigned long long) + 236
8        duckdb::PipelineTask::ExecuteTask(duckdb::TaskExecutionMode) + 264
9        duckdb::ExecutorTask::Execute(duckdb::TaskExecutionMode) + 160
10       duckdb::TaskScheduler::ExecuteForever(std::__1::atomic<bool>*) + 620
11       void* std::__1::__thread_proxy[abi:ne180100]<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct>>, void (*)(duckdb::TaskScheduler*, std::__1::atomic<bool>*), duckdb::TaskScheduler*, std::__1::atomic<bool>*>>(void*) + 56
12       _pthread_start + 136
13       thread_start + 8

This error signals an assertion failure within DuckDB. This usually occurs due to unexpected conditions or errors in the program's logic.
For more information, see https://duckdb.org/docs/dev/internal_errors
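For the Python client, the steps above could be scripted roughly as follows. This is a minimal sketch, not part of the original report: `try_query` and `reproduce` are hypothetical helper names, the query is the reported one reformatted across lines with redundant parentheses and aliases dropped, and `con` is assumed to be a connection from the `duckdb` Python package with the unzipped Parquet file in the working directory.

```python
TABLE = "map_us_native_american_land"
PARQUET = "map_us_native_american_land.parquet"  # unzipped attachment from the report

# Query adapted from the report: bounding-box prefilter plus ST_Contains.
QUERY = f"""
SELECT geo_id, name, latitude, longitude, ST_AsGeoJSON(map_layer) AS map_layer
FROM {TABLE}
WHERE map_layer_xmax >= -222.44909902948635
  AND map_layer_xmin <= -22.126204766600182
  AND map_layer_ymax >= -4.221879830812881
  AND map_layer_ymin <= 76.82337409623631
  AND ST_Contains(map_layer,
                  ST_GeomFromText('POINT(-97.68606911695892 35.9931717004939)'))
LIMIT 50;
"""

# Setup statements taken from the report.
SETUP = [
    "INSTALL spatial;",
    "LOAD spatial;",
    f"CREATE TABLE {TABLE} AS SELECT * FROM '{PARQUET}';",
]
CREATE_INDEX = f"CREATE INDEX my_idx ON {TABLE} USING RTREE (map_layer);"


def try_query(con, label):
    """Run QUERY and return (label, outcome), where outcome is a row count or the error text."""
    try:
        rows = con.execute(QUERY).fetchall()
        return label, f"ok, {len(rows)} rows"
    except Exception as exc:  # broad on purpose: the failure is an internal error
        return label, f"failed: {exc}"


def reproduce(con):
    """Run the query once before and once after creating the R-tree index."""
    for stmt in SETUP:
        con.execute(stmt)
    results = [try_query(con, "without index")]
    con.execute(CREATE_INDEX)
    results.append(try_query(con, "with index"))
    return results
```

Calling `reproduce(duckdb.connect())` after `import duckdb` returns one outcome per run, which makes it easy to compare behavior across DuckDB versions.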

Let me know if you have any more questions. Thanks.

map_us_native_american_land.parquet.zip

OS:

macOS

DuckDB Version:

1.2.0

DuckDB Client:

DuckDB CLI, Python

Hardware:

No response

Full Name:

Scott Wang

Affiliation:

Orennia Inc.

What is the latest build you tested with? If possible, we recommend testing with the latest nightly build.

I have tested with a stable release

Did you include all relevant data sets for reproducing the issue?

Yes

Did you include all code required to reproduce the issue?

  • [x] Yes, I have

Did you include all relevant configuration (e.g., CPU architecture, Python version, Linux distribution) to reproduce the issue?

  • [x] Yes, I have

orennia-scott-wang • Feb 13 '25 17:02

Reduced example

create table map_us_native_american_land as select * from 'map_us_native_american_land.parquet';
CREATE INDEX my_idx ON map_us_native_american_land USING RTREE (map_layer);
SELECT longitude FROM map_us_native_american_land WHERE (map_layer_xmax >= -200 AND map_layer_xmax <= 2) AND ST_CONTAINS( map_layer, ST_GeomFromText( 'POINT(-97.68606911695892 35.9931717004939)' ) );

Maxxen • Feb 13 '25 20:02

Hello @Maxxen, I ran the same test in DuckDB v1.2.1 and the issue still persists. Is there anything we can do to work around it? Thanks for looking into it.

orennia-scott-wang • Mar 06 '25 15:03
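On the question of a workaround: the original report shows the query succeeding before the index is created, so one possible stopgap may be to drop the R-tree index and rely on the bounding-box columns for prefiltering. This is an unconfirmed sketch, not a fix from the maintainers. `drop_index_sql` and `drop_rtree_index` are hypothetical helper names, and `con` is assumed to be an open connection from the `duckdb` Python package.

```python
INDEX_NAME = "my_idx"  # index name used in the report


def drop_index_sql(index_name=INDEX_NAME):
    """Build the statement that removes the R-tree index."""
    return f"DROP INDEX IF EXISTS {index_name};"


def drop_rtree_index(con, index_name=INDEX_NAME):
    """Drop the index, intending that later queries use a plain table scan.

    Assumption: `con` behaves like a duckdb connection (has .execute(sql)).
    Whether this avoids the internal error has not been verified here.
    """
    con.execute(drop_index_sql(index_name))
```

The trade-off is losing the index for other spatial queries, so this would only make sense until the underlying bug is resolved.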