h3-pg icon indicating copy to clipboard operation
h3-pg copied to clipboard

GiST support

Open zachasme opened this issue 5 years ago • 17 comments

Replaces #7. Part of #5.

References:

  • GiST bible
  • PostGIS implementation (gserialized_gist*.* a bit spread over here and here).

I have closed the old PR which contained code for both GiST and SP-GiST, splitting it into two separate PRs. This PR also includes the work by @AbelVM in #40.

The PR builds and runs, but basic tests fail (see test/sql/opclass_gist.sql), so there is some logical error. Any help figuring it out is appreciated. :heart:

We should also consider either 1) remove the unused parts of h3Index.h, 2) simply write out the few macros we need, or 3) get the macros pulled into the public API upstream.

zachasme avatar Aug 24 '20 14:08 zachasme

There is still some logical error in the current implementation, as seen from this simple test:

\set hexagon '\'831c02fffffffff\'::h3index'
CREATE TABLE h3_test_gist (hex h3index);
CREATE INDEX GIST_IDX ON h3_test_gist USING gist(hex);

-- insert immediate children
INSERT INTO h3_test_gist (hex) SELECT h3_to_children(:hexagon);
-- num children is 7
SELECT COUNT(*) FROM h3_test_gist WHERE hex <@ :hexagon;
     7
-- insert deeper children
INSERT INTO h3_test_gist (hex) SELECT h3_to_children(:hexagon, 8);
-- num children is ... ZERO?
SELECT COUNT(*) FROM h3_test_gist WHERE hex <@ :hexagon;
     0  

Any help figuring it out is much appreciated. :heart:

zachasme avatar Aug 24 '20 15:08 zachasme

Using my sql function, with the very same logic I ported to c:

WITH a AS(SELECT h3_to_children('831c02fffffffff'::h3index, 8) as h)
SELECT 
	count(*)
FROM a
WHERE h3_contains('831c02fffffffff'::h3index, a.h);

Returns 16807

Weird ¯\(ツ)

AbelVM avatar Aug 24 '20 15:08 AbelVM

That's the problem. The regular operations work fine, but breaks when using the GiST index as it is currently implemented in this branch:

CREATE TABLE t (hex h3index);
INSERT INTO t (hex) SELECT h3_to_children('831c02fffffffff', 8);
SELECT COUNT(*) FROM t WHERE hex <@ '831c02fffffffff';
 16807

-- after adding gist index
CREATE INDEX idx ON t USING gist(hex);
SELECT COUNT(*) FROM t WHERE hex <@ '831c02fffffffff';
     0


It is the algorithm implemented in opclass_gist.c that is off.

zachasme avatar Aug 26 '20 12:08 zachasme

The logic used in gist_cmp is the same within my custom h3_contains function :thinking:

Maybe something is lost in translation :disappointed: . Does the c function work as expected when run standalone? I will give it some more love this afternoon

AbelVM avatar Aug 26 '20 13:08 AbelVM

Ah sorry for the confusion -- as far as I can tell, your gist_cmp works as intended :+1:

I think the problem lies in the functions

// opclass_gist.c
h3index_gist_consistent
h3index_gist_union
h3index_gist_penalty
h3index_gist_picksplit
h3index_gist_same

which together defines the behavior of our GiST implementation.

zachasme avatar Aug 27 '20 06:08 zachasme

Is work still being done on this pull request or is it abandoned? Having GIST indices would be great.

JohanMollevik avatar Aug 25 '21 14:08 JohanMollevik

I haven't touched the branch in a year, but I don't wish to abandon it. We need to correctly define the GIST operator class methods for the H3 data type. Our efforts so far (opcalss_gist.c in this branch) results in a fail of our basic test suite. Any help would be appreciated (see https://github.com/bytesandbrains/h3-pg/pull/42#issuecomment-681630564)!

zachasme avatar Aug 30 '21 06:08 zachasme

Updated reference: https://www.postgresql.org/docs/13/gist-extensibility.html

AbelVM avatar Sep 08 '21 12:09 AbelVM