cloudberry
cloudberry copied to clipboard
[Bug] The result of query with geometric operators(polygon type) always be wrong
Apache Cloudberry version
main
What happened
sql - create POINT_TBL
CREATE TABLE POINT_TBL(f1 point);
INSERT INTO POINT_TBL(f1) VALUES ('(0.0,0.0)');
INSERT INTO POINT_TBL(f1) VALUES ('(-10.0,0.0)');
INSERT INTO POINT_TBL(f1) VALUES ('(-3.0,4.0)');
INSERT INTO POINT_TBL(f1) VALUES ('(5.1, 34.5)');
INSERT INTO POINT_TBL(f1) VALUES ('(-5.0,-12.0)');
INSERT INTO POINT_TBL(f1) VALUES ('(1e-300,-1e-300)');
INSERT INTO POINT_TBL(f1) VALUES ('(1e+300,Inf)');
INSERT INTO POINT_TBL(f1) VALUES ('(Inf,1e+300)');
INSERT INTO POINT_TBL(f1) VALUES (' ( Nan , NaN ) ');
INSERT INTO POINT_TBL(f1) VALUES ('10.0,10.0');
INSERT INTO POINT_TBL(f1) VALUES (NULL);
SELECT * FROM POINT_TBL;
and the query with polygon expression
SELECT count(*) FROM point_tbl WHERE f1 <@ polygon '(0,0),(0,100),(100,100),(50,50),(100,0),(0,0)';
the result of SeqScan and IndexOnlyScan/IndexScan is different.
postgres=# set enable_indexscan to off;
SET
postgres=# set enable_indexonlyscan to off;
SET
postgres=# set enable_seqscan to on;
SET
postgres=# explain SELECT count(*) FROM point_tbl WHERE f1 <@ polygon '(0,0),(0,100),(100,100),(50,50),(100,0),(0,0)';
QUERY PLAN
------------------------------------------------------------------------------------------
Aggregate (cost=1.07..1.08 rows=1 width=8)
-> Gather Motion 3:1 (slice1; segments: 3) (cost=0.00..1.07 rows=1 width=0)
-> Seq Scan on point_tbl (cost=0.00..1.05 rows=1 width=0)
Filter: (f1 <@ '((0,0),(0,100),(100,100),(50,50),(100,0),(0,0))'::polygon)
Optimizer: Postgres query optimizer
(5 rows)
postgres=# SELECT count(*) FROM point_tbl WHERE f1 <@ polygon '(0,0),(0,100),(100,100),(50,50),(100,0),(0,0)';
count
-------
5
(1 row)
postgres=#
postgres=# set enable_seqscan to off;
SET
postgres=# set enable_indexscan to on;
SET
postgres=# set enable_indexonlyscan to on;
SET
postgres=# explain SELECT count(*) FROM point_tbl WHERE f1 <@ polygon '(0,0),(0,100),(100,100),(50,50),(100,0),(0,0)';
QUERY PLAN
----------------------------------------------------------------------------------------------
Aggregate (cost=8.17..8.18 rows=1 width=8)
-> Gather Motion 3:1 (slice1; segments: 3) (cost=0.13..8.17 rows=1 width=0)
-> Index Only Scan using gpointind on point_tbl (cost=0.13..8.15 rows=1 width=0)
Index Cond: (f1 <@ '((0,0),(0,100),(100,100),(50,50),(100,0),(0,0))'::polygon)
Optimizer: Postgres query optimizer
(5 rows)
postgres=# SELECT count(*) FROM point_tbl WHERE f1 <@ polygon '(0,0),(0,100),(100,100),(50,50),(100,0),(0,0)';
count
-------
4
(1 row)
Also i found another problem: The point ((1e-300,-1e-300)) always in the result.
postgres=# set enable_indexscan to off;
SET
postgres=# set enable_indexonlyscan to off;
SET
postgres=# set enable_seqscan to on;
SET
postgres=# SELECT * FROM point_tbl WHERE f1 <@ polygon '(0,0),(0,100),(100,100),(50,50),(100,0),(0,0)';
f1
------------------
(1e-300,-1e-300)
(NaN,NaN)
(0,0)
(5.1,34.5)
(10,10)
(5 rows)
postgres=# set enable_seqscan to off;
SET
postgres=# set enable_indexscan to on;
SET
postgres=# set enable_indexonlyscan to on;
SET
postgres=# SELECT * FROM point_tbl WHERE f1 <@ polygon '(0,0),(0,100),(100,100),(50,50),(100,0),(0,0)';
f1
------------------
(0,0)
(5.1,34.5)
(10,10)
(1e-300,-1e-300)
(4 rows)
i guess the polygon looks like(not sure):
y
↑
| (0,100) *───────────────────────* (100,100)
| │ /
| │ /
| │ /
| │ /
| │ /
| │ /
| │ /
| │ (50,50) *
| │ \
| │ \
| │ \
| │ \
| │ \
| │ \
| │ \
| (0,0) *──┼───────────────────────* (100,0)
|
|
└───────────────────────────────────> x
And the point (1e-300,-1e-300) should be left of the line ((0,0) , (0,100)), because its Y(-1e-300) is a neg value.
What you think should happen instead
No response
How to reproduce
nope
Operating System
all
Anything else
No response
Are you willing to submit PR?
- [ ] Yes, I am willing to submit a PR!
Code of Conduct
- [x] I agree to follow this project's Code of Conduct.
Hi, @jiaqizho welcome!🎊 Thanks for taking the time to point this out.🙌
Interesting. I test this on Postgres, it returned 5 rows too if disable using indexes including bitmap.
explain SELECT * FROM point_tbl WHERE f1 <@ polygon '(0,0),(0,100),(100,100),(50,50),(100,0),(0,0)';
QUERY PLAN
----------------------------------------------------------------------------------
Index Only Scan using gpointind on point_tbl (cost=0.13..8.15 rows=1 width=16)
Index Cond: (f1 <@ '((0,0),(0,100),(100,100),(50,50),(100,0),(0,0))'::polygon)
(2 rows)
SELECT * FROM point_tbl WHERE f1 <@ polygon '(0,0),(0,100),(100,100),(50,50),(100,0),(0,0)';
f1
------------------
(1e-300,-1e-300)
(0,0)
(10,10)
(5.1,34.5)
(4 rows)
set enable_indexonlyscan = off;
set enable_indexscan = off;
set enable_bitmapscan = off;
explain(costs off, verbose) SELECT * FROM point_tbl WHERE f1 <@ polygon '(0,0),(0,100),(100,100),(50,50),(100,0),(0,0)';
QUERY PLAN
----------------------------------------------------------------------------------------
Seq Scan on tpcds.point_tbl
Disabled: true
Output: f1
Filter: (point_tbl.f1 <@ '((0,0),(0,100),(100,100),(50,50),(100,0),(0,0))'::polygon)
(4 rows)
SELECT * FROM point_tbl WHERE f1 <@ polygon '(0,0),(0,100),(100,100),(50,50),(100,0),(0,0)';
f1
------------------
(0,0)
(5.1,34.5)
(1e-300,-1e-300)
(NaN,NaN)
(10,10)
(5 rows