David Sisson

65 comments by David Sisson

We do want the type to be known so that the plan can be optimized (and so that the final types can be computed), so this makes sense. I...

If the behavior is different, why not just switch left and right when implementing the join? Is there value in maintaining the source left as left?

> It would be helpful to have example plans for this feature. In the abstract I understand the intent of what we're trying to accomplish, but I think it would...

How does grouping id correspond to the additional output of the aggregation relation described (grouping set) here? https://substrait.io/relations/logical_relations/#aggregate-operation

[Define sideband optimization hints #705](https://github.com/substrait-io/substrait/pull/705) was added to the specification to help support the required feature on the DuckDB side.

@jacques-n Here's a plan along with the steps I used to get DuckDB to spit it out: [Gist showing Duplicate Eliminated Join Plan](https://gist.github.com/EpsilonPrime/d863fc6a955b718668219674dfe28307)

I like the idea of the reusable operators (although I think of them as AggregateDistinct_SaveBuildHashTableKeys and HashEquiJoin_ReadSavedHashTableFromElsewhere). We could add optimization metadata to HashEquiJoin (where to get the saved...

I haven't added any documentation yet, but would optimization hints as described in #705 suffice as a replacement for these two new operators?

I suspect this will be discussed in the Substrait Community meeting this week. There has been some discussion around adding first-class support for ordering. If we had ordering ties...

I have a few questions about the proposed test file format, mainly stemming from not knowing what the format's intended use would be.
- What use cases would this test...