lantern icon indicating copy to clipboard operation
lantern copied to clipboard

Pick up index dimension on first insert for an empty table

Open therealdarkknight opened this issue 2 years ago • 2 comments

Addresses issue 198 using the approach documented in the discussion of PR 209.

When creating an index with no dimension specified on an empty table, we set the dimension in the index header to 0. Later, during the first insert, we infer the dimension from the inserted first row and then update it in the index header.

therealdarkknight avatar Dec 19 '23 00:12 therealdarkknight

Looks good. Do you think adding this code

isNull = true;
while(isNull) {
    // Get the indexed column out of the row and return it's dimensions
    datum = heap_getattr(tuple, indexCol, RelationGetDescr(heap), &isNull);
    if(!isNull) {
        array = DatumGetArrayTypePCopy(datum);
        n_items = ArrayGetNItems(ARR_NDIM(array), ARR_DIMS(array));
        break;
    }
    tuple = heap_getnext(scan, ForwardScanDirection);
    if(tuple == NULL) {
        heap_endscan(scan);
        return 0;
    }
}

In line 246 makes sense?

Because now this will fail to infer the dimensions:

create table test_emp (v real[]);
insert into test_emp (v) values (NULL),( '{1,1,1}');
create index on test_emp using hnsw(v);

cc: @Ngalstyan4

Yes, great point! I agree with this change. As far as I know, this behavior was present prior to this PR/issue (I just realized I also brought it up earlier, see number 2 in my first PR for this ).

I have some other ideas on how the way we do dimension inference can be refactored-- see number 3 in my first PR above. Since this edit is relevant to index creation on non-empty tables, (and this issue/PR is concerned with index creation on empty tables) I can submit a separate PR with this change and also refactor those functions there.

What do you think?

therealdarkknight avatar Dec 21 '23 21:12 therealdarkknight

Yep that will work! Actually maybe we won't need that as per discussion with Narek we decided that the tradeoff of UX doesn't worth the complexity added (counting edge cases as well)

var77 avatar Dec 22 '23 11:12 var77