Will Manning

Results 19 comments of Will Manning

(In particular, it gives good estimates of cardinality of arbitrary *combinations of attributes* rather than just attributes, which is cool / handy for compound join keys)

If we're taking something off the shelf, this crate looks potentially better: https://github.com/cloudflare/cardinality-estimator/tree/main

The historical reason for this was that compressed validity ended up being extremely slow in-memory and was trashing query performance. I think with the operator stuff, we're in a better place...

From Vortex slack, @AdamGS suggested:

> I wonder how good we can compress it if we "transmute" the underlying bit buffer into some integer

This is actually a great idea...
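To make the "transmute" idea concrete, here is a rough Python sketch (not Vortex code): it reinterprets a packed validity bitmap as little-endian 64-bit words and then run-length encodes the words. The helper names `bitmap_to_u64_words` and `run_length_encode` are made up for illustration, and run-length encoding is only a stand-in for whatever integer encoding would actually be applied.

```python
import struct


def bitmap_to_u64_words(bitmap: bytes) -> list[int]:
    """Reinterpret a packed bit buffer as little-endian u64 words.

    The buffer is zero-padded to a multiple of 8 bytes (an assumption of
    this sketch; a real implementation would track the bit length).
    """
    padded = bitmap + b"\x00" * (-len(bitmap) % 8)
    return list(struct.unpack(f"<{len(padded) // 8}Q", padded))


def run_length_encode(words: list[int]) -> list[tuple[int, int]]:
    """Collapse consecutive equal words into (word, run_length) pairs."""
    runs: list[tuple[int, int]] = []
    for word in words:
        if runs and runs[-1][0] == word:
            runs[-1] = (word, runs[-1][1] + 1)
        else:
            runs.append((word, 1))
    return runs


# 128 valid bits followed by a partially valid tail byte.
example_bitmap = b"\xff" * 16 + b"\x0f"
example_runs = run_length_encode(bitmap_to_u64_words(example_bitmap))
```

A mostly-valid column turns into long runs of the all-ones word, which is the kind of input that ordinary integer encodings tend to handle well.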

For UUID specifically, v4 has a hashed namespace (which is often presumably constant in a column), and then the rest is random bytes (incompressible, probably, ish). V7 is better bc...

FixedSizeList doesn't actually increase DType size (which was already 16 and remains 16)

Alternatively (and much easier since GH does the diffing vs baseline), we could also just take the geometric mean of query runtimes per engine-suite pair.
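For reference, a minimal sketch of that summary in Python, assuming runtimes are collected as a mapping from (engine, suite) to a list of per-query runtimes. The function names, the engine and suite labels, and the numbers are all made up for illustration, not taken from the actual benchmark setup.

```python
import math


def geometric_mean(values: list[float]) -> float:
    """Geometric mean via the mean of logs; requires strictly positive values."""
    if not values:
        raise ValueError("need at least one runtime")
    if any(v <= 0 for v in values):
        raise ValueError("runtimes must be strictly positive")
    return math.exp(sum(math.log(v) for v in values) / len(values))


def summarize(runtimes: dict[tuple[str, str], list[float]]) -> dict[tuple[str, str], float]:
    """One geometric-mean number per (engine, suite) pair."""
    return {pair: geometric_mean(times) for pair, times in runtimes.items()}


# Hypothetical per-query runtimes in milliseconds.
example = {
    ("engine_a", "suite_x"): [1.0, 4.0, 16.0],
    ("engine_b", "suite_x"): [2.0, 8.0],
}
summary = summarize(example)
```

The geometric mean is a natural fit here because a 2x slowdown on a short query moves the summary as much as a 2x slowdown on a long one, so no single heavy query dominates the number.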

I think this is clearly fine for the statistic, since we do the same for Min/Max. It's less clear that it's valid for the compute function, which should arguably be...