Dylan

Results 65 comments of Dylan

> Please use the format [CALCITE-XXXX] describe problem(name). The problem description can be added in [JIRA](https://issues.apache.org/jira/projects/CALCITE/).

Done.

Cool, we can enhance DML with this idea later.

> What's your plan for collecting this?
>
> Table statistics should be in meta, data are flowing through CN, maybe we need to have some stat reporters in CN?...

> Could this be something we implement at the state store layer? The data we require seems quite generic.
>
> In that case, wrt our discussion about insert + update...

For append-only tables, there are some efficient incremental methods to maintain statistics.

- Table row count: a simple counter is enough.
- Column NDV: we can...
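As a rough illustration of what incremental maintenance on an append-only table could look like, here is a minimal Python sketch. The comment above is cut off before it says how NDV would be maintained, so the use of a HyperLogLog-style sketch here is an assumption, not the author's stated method. The class name `AppendOnlyStats` and its methods are hypothetical and do not come from any project code.

```python
import hashlib
import math


class AppendOnlyStats:
    """Hypothetical incremental statistics for one column of an append-only table."""

    def __init__(self, precision: int = 12):
        self.row_count = 0                 # table row count: a plain counter
        self.p = precision                 # number of hash bits used as register index
        self.m = 1 << precision            # number of HyperLogLog registers
        self.registers = [0] * self.m

    @staticmethod
    def _hash64(value) -> int:
        # Deterministic 64-bit hash of the value's repr (an arbitrary choice).
        digest = hashlib.blake2b(repr(value).encode("utf-8"), digest_size=8).digest()
        return int.from_bytes(digest, "big")

    def append(self, value) -> None:
        """Update the statistics for one appended row; O(1) per row."""
        self.row_count += 1
        h = self._hash64(value)
        width = 64 - self.p
        idx = h >> width                   # top p bits pick the register
        rest = h & ((1 << width) - 1)      # remaining bits
        rank = width - rest.bit_length() + 1  # 1-based position of the leftmost 1-bit
        if rank > self.registers[idx]:
            self.registers[idx] = rank

    def ndv(self) -> float:
        """Approximate number of distinct values seen so far."""
        m = self.m
        alpha = 0.7213 / (1 + 1.079 / m)   # bias constant, valid for m >= 128
        raw = alpha * m * m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * m and zeros > 0:
            # Small-range correction: fall back to linear counting.
            return m * math.log(m / zeros)
        return raw
```

Because rows are only ever appended, neither the counter nor the registers ever need to be decremented, which is what keeps the per-row cost constant. Handling deletes or updates would need a different structure.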

Here is an example. We can simply use SQL to express the statistics we need.

```sql
create materialized view sbtest1_statistic as
select
  count(*) as table_count,
  approx_count_distinct(id) id_ndv,
  approx_count_distinct(k) k_ndv,
  approx_count_distinct(c)...
```

> > State store lacks a schema, but for a table we need column-level statistics.
>
> Apologies, I meant `StateTable`. I guess you are referring to collecting...

I think we should hold off until we have established that the benefits of introducing statistics clearly outweigh the costs.

> @chenzl25 Is this completed?

From the user's perspective, yes. We can still do some optimizations: subqueries with union, and a better shuffle strategy.

The `Timestampz` type is encoded as an i64, which differs from the encoding of `Timestamp`, so we need to fix it.