Precompute hash values for connector-provided distributions
Currently, HashGenerationOptimizer.Rewriter#visitExchange
has a constraint:
// Currently, precomputed hash values are only supported for system hash distributions without constants
It would be beneficial to precompute hash values for connector-provided distributions too (e.g. buckets).
cc @gaurav8297 @raunaqmorarka
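For context, here is a simplified, self-contained model of that decision. The names below (PartitioningModel, canPrecomputeHashToday, canPrecomputeHashProposed) are illustrative and are not Trino classes, but the shape follows the constraint quoted above: today the hash is only precomputed for the system hash distribution without constant arguments, and the idea is to additionally allow connector-provided distributions that can expose a bucket/hash function.

```java
// Hypothetical, simplified model of the check in visitExchange; not Trino code.
import java.util.List;

public class PrecomputedHashSketch
{
    enum Kind { SYSTEM_HASH, CONNECTOR_PROVIDED }

    record PartitioningModel(Kind kind, List<String> arguments, boolean hasConstantArguments) {}

    // Current behavior: connector-provided distributions (e.g. Hive/Iceberg buckets)
    // never get a precomputed hash column.
    static boolean canPrecomputeHashToday(PartitioningModel partitioning)
    {
        return partitioning.kind() == Kind.SYSTEM_HASH && !partitioning.hasConstantArguments();
    }

    // Proposed behavior: also allow connector distributions, provided the connector
    // exposes a hash/bucket function the engine can evaluate up front.
    static boolean canPrecomputeHashProposed(PartitioningModel partitioning, boolean connectorExposesBucketFunction)
    {
        if (partitioning.hasConstantArguments()) {
            return false;
        }
        return partitioning.kind() == Kind.SYSTEM_HASH || connectorExposesBucketFunction;
    }

    public static void main(String[] args)
    {
        PartitioningModel icebergBuckets = new PartitioningModel(Kind.CONNECTOR_PROVIDED, List.of("order_id"), false);
        System.out.println(canPrecomputeHashToday(icebergBuckets));          // false
        System.out.println(canPrecomputeHashProposed(icebergBuckets, true)); // true
    }
}
```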
That would be beneficial if we use these hashes more than once. Indeed we do -- for distributing data between the nodes and then within the connector page sink. We may also distribute between page sinks on a node if task writer count > 1?
Anyway, for reuse within the connector page sink, what would be a proposed SPI to represent that?
We may also distribute between page sinks on a node if task writer count > 1?
That's the idea.
Anyway, for reuse within the connector page sink, what would be a proposed SPI to represent that?
I don't think a new SPI is needed. I think we just need to use the connector bucket function to compute the hash (and possibly mangle the hash at the remote exchange and local exchange levels).
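A sketch of that idea, assuming the existing io.trino.spi.connector.BucketFunction SPI (int getBucket(Page page, int position)) and no new SPI. The mixing constants and the class/method names below are illustrative only: the connector bucket is computed once per row and then mixed ("mangled") differently to derive the remote-exchange and local-exchange assignments, so the two levels do not collapse onto the same distribution.

```java
import io.trino.spi.Page;
import io.trino.spi.connector.BucketFunction;

final class ReusedConnectorHashSketch
{
    private final BucketFunction connectorBucketFunction;

    ReusedConnectorHashSketch(BucketFunction connectorBucketFunction)
    {
        this.connectorBucketFunction = connectorBucketFunction;
    }

    // Computed once per row; this is the value that could travel as a precomputed hash column.
    int connectorBucket(Page page, int position)
    {
        return connectorBucketFunction.getBucket(page, position);
    }

    // Remote exchange: spread connector buckets over worker nodes.
    int remotePartition(int connectorBucket, int nodeCount)
    {
        return mix(connectorBucket, 0x9E3779B9) % nodeCount;
    }

    // Local exchange: spread connector buckets over page sinks/writers on one node
    // (relevant when task writer count > 1), using a different mix.
    int localPartition(int connectorBucket, int writerCount)
    {
        return mix(connectorBucket, 0x85EBCA6B) % writerCount;
    }

    private static int mix(int value, int seed)
    {
        int h = value * seed;
        h ^= h >>> 16;
        return h & 0x7FFFFFFF; // keep non-negative for the modulo
    }
}
```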
Without an SPI change we will still have hash/partitioning inside the connector. See the example in Iceberg: https://github.com/trinodb/trino/blob/6310e5e1415fe0bc89444016fc516640c3a3fcbb/plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergPageSink.java#L266 This should reuse the precomputed hash values.
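Purely as an illustration of what "reuse the precomputed hash values" inside a connector page sink could look like: the engine does not currently hand a hash column to ConnectorPageSink, so the hashChannel below is hypothetical; IcebergPageSink instead recomputes the partition values per row, as in the linked code.

```java
import io.trino.spi.Page;
import io.trino.spi.block.Block;
import io.trino.spi.type.BigintType;

final class HashReusingSinkSketch
{
    // Hypothetical: index of an engine-appended BIGINT hash column in the pages given to the sink.
    private final int hashChannel;

    HashReusingSinkSketch(int hashChannel)
    {
        this.hashChannel = hashChannel;
    }

    // Read the precomputed hash instead of re-deriving it from the partition columns.
    long getHash(Page page, int position)
    {
        Block hashBlock = page.getBlock(hashChannel);
        return BigintType.BIGINT.getLong(hashBlock, position);
    }
}
```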
Without an SPI change we will still have hash/partitioning inside the connector. See the example in Iceberg
The page indexer is a different matter, as it is local to the sink (it enumerates "seen" partitions). It might consume the hash, but the biggest cost in the page indexer is the hash lookups. There is another issue about improving the page indexer: https://github.com/trinodb/trino/issues/14183
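To make that cost argument concrete, here is a minimal, self-contained model of what a page indexer does (illustrative only, not Trino's PageIndexer implementation): each row's partition key is looked up in a table and mapped to a small dense index for the "seen" partitions. Even with a precomputed hash available, the per-row probe and equality check remain, which is what dominates.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class PageIndexerModel
{
    private final Map<List<Object>, Integer> seenPartitions = new HashMap<>();

    // rows: one partition-key value list per row
    int[] indexRows(List<List<Object>> rows)
    {
        int[] indexes = new int[rows.size()];
        for (int i = 0; i < rows.size(); i++) {
            // The expensive part: hash + probe + equality check for every row,
            // regardless of whether the hash itself was precomputed upstream.
            indexes[i] = seenPartitions.computeIfAbsent(rows.get(i), key -> seenPartitions.size());
        }
        return indexes;
    }

    int maxIndex()
    {
        return seenPartitions.size() - 1;
    }
}
```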