Andrew Lamb
I wonder if we could do some sort of hybrid approach where `ScalarValue` still retains a `DataType`, but it isn't encoded in the variant like ```rust enum ScalarValue {...
> Otherwise, IMO this is just a code simplification (Still a nice step forward) but not the goal of decoupling types https://github.com/apache/datafusion/issues/11513 , Yes, I agree with this assessment 🤔...
To make it clear that this PR is not waiting on more review (we are discussing on https://github.com/apache/datafusion/pull/12536), I am marking this as a draft
I agree with @findepi that a more general solution to this problem is to extend the `VALUES` coercion to pick the widest super type that works for all values in...
One observation here is that `min` and `max` on strings do not seem to be very common operations -- grouping on strings is more common. Maybe there...
> @alamb is there anyone working on this + is this issue still relevant? I would love to tackle it as it seems like an interesting feature/optimization. I don't know...
It actually turns out that `Min` / `Max` on string/binary columns are in several ClickBench queries: https://github.com/apache/datafusion/blob/a08f923c2acb1a46614970231d9a672c36ce3ad2/benchmarks/queries/clickbench/queries.sql#L22-L23 https://github.com/apache/datafusion/blob/a08f923c2acb1a46614970231d9a672c36ce3ad2/benchmarks/queries/clickbench/queries.sql#L29 I don't think `min`/`max` have appeared as priorities in benchmarking before because...
## Background (I will make a PR shortly to add this to the actual datafusion docs) [`GroupsAccumulator`](https://docs.rs/datafusion/latest/datafusion/logical_expr/trait.GroupsAccumulator.html) logically does this: ``` ┌─────┐ │ 0 │───────────▶ "A" ├─────┤ │ 1 │───────────▶...
# Potential Design One high level idea is to build a data structure that uses the same internal format (views/buffers) as [`StringViewArray`](https://docs.rs/arrow/latest/arrow/array/struct.GenericByteViewArray.html) in Arrow: ``` ┌───────────────────────────────────────────┐ │ Stored in Vec...
> @alamb is there anyone working on this + is this issue still relevant? I would love to tackle it as it seems like an interesting feature/optimization. @devanbenz I think...