hazelcast Implement lazy loading of the SQL schema [HZ-3756]

Currently, the whole SQL catalog is loaded into memory for each optimization. This can involve a lot of loading when there is a large number of objects (mappings, views, types, ...).

It should be possible to implement lazy loading. We need to implement org.apache.calcite.schema.Schema. In it we only need to override Table getTable(String) and Set<String> getTableNames(). For the latter we only need keySet() from our IMap. For the former we can then load the details of the view or mapping on demand. I did some quick debugging: Calcite calls getTable only for tables actually encountered in the query, so it should be doable.
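A minimal sketch of the idea, assuming nothing about the actual Hazelcast classes: `TableSource` is a hypothetical stand-in for the two `Schema` methods mentioned above, a plain `Map` stands in for the IMap, and `tableFactory` represents the expensive step that turns a stored definition into a table object. It is meant to illustrate the shape of the change, not to be the real implementation.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical stand-in for the two org.apache.calcite.schema.Schema
// methods discussed above; the real interface has more methods.
interface TableSource<T> {
    T getTable(String name);
    Set<String> getTableNames();
}

// Names come straight from the catalog's key set; the table object
// is built only when a name is actually looked up, then cached.
final class LazySchema<D, T> implements TableSource<T> {
    private final Map<String, D> catalog;       // assumption: stands in for the IMap of definitions
    private final Function<D, T> tableFactory;  // assumption: the costly definition-to-table conversion
    private final Map<String, T> resolved = new ConcurrentHashMap<>();

    LazySchema(Map<String, D> catalog, Function<D, T> tableFactory) {
        this.catalog = catalog;
        this.tableFactory = tableFactory;
    }

    @Override
    public T getTable(String name) {
        // A null result from the mapping function records no entry,
        // so unknown names simply yield null on every call.
        return resolved.computeIfAbsent(name, n -> {
            D definition = catalog.get(n);
            return definition == null ? null : tableFactory.apply(definition);
        });
    }

    @Override
    public Set<String> getTableNames() {
        // Cheap: no table objects are created here.
        return catalog.keySet();
    }

    // Exposed only to make the lazy behaviour observable in this sketch.
    int resolvedCount() {
        return resolved.size();
    }
}
```

With a catalog of many entries, a query that touches a single table would trigger `tableFactory` once, while `getTableNames()` never triggers it at all.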

Mar 16 '23 21:03 viliam-durina

I don't understand the suggestion to override getTableNames(): HazelcastSchema.java already overrides this behavior implicitly. It keeps its own tableMap and overrides getTableMap(), which is then used by the final getTableNames() method of AbstractSchema:

  @Override public final Set<String> getTableNames() {
    //noinspection RedundantCast
    return (Set<String>) getTableMap().keySet();
  }

Nov 13 '23 20:11 SpacRocket

Calcite uses getTableNames(), but generating its result can be quite cheap because it contains only names. Generating the full tableMap, on the other hand, is costly because we create a complex representation of each table.

Nov 14 '23 14:11 k-jamroz

Internal Jira issue: HZ-3756

Nov 14 '23 14:11 github-actions[bot]

Related Calcite issue: https://issues.apache.org/jira/browse/CALCITE-5687

Dec 20 '23 17:12 k-jamroz

I tried an approach that analyzes the query elements and then extracts only those from the table resolver. There were a few issues with views and data connections that I managed to solve, but a lot of unit tests are still failing. It's not very polished, as I just wanted to test the approach, but maybe it will be useful to someone: https://github.com/SpacRocket/hazelcast/tree/HZ-24004

Mar 03 '24 22:03 SpacRocket