Brent Gardner

71 comments by Brent Gardner

@zsxwing is this what you were looking for? https://github.com/spaceandtimelabs/docker-spark-deltalake

> Later I'll create a PR for this.

@yahoNanJing this intersects with work I'm currently doing, so anything you could share would be helpful!

> open up interesting integrations with the rest of the ecosystem. e.g.: 1. substrait test suites that show correct before and after optimization plans that we could get for free...

@yahoNanJing this issue seems related to https://github.com/apache/arrow-datafusion/pull/3311 where we are working towards allowing users to register `delta-rs` tables dynamically through SQL at runtime in Ballista.

> `ObjectStore` in the `ExecutionContext` in order to use it right?

I think the problem is that this must happen dynamically in the case of a DataFusion executor in Ballista....

Awesome news! Thank you very much for all the hard work that went into this. My company is very excited to make some PRs in the near future :)

Might `count(*)` be as simple as a stats lookup in Parquet or DeltaLake? Reading a billion values just to count them seems sub-optimal, but that can definitely be addressed with...
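To make the stats-lookup idea concrete, here is a minimal sketch. It relies on the fact that a Parquet footer records a row count per row group; the `RowGroupMeta` and `FileMeta` classes below are hypothetical stand-ins for that metadata, not the API of any Parquet or Delta Lake library, and the row counts are made up for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RowGroupMeta:
    """Hypothetical stand-in for the metadata of one row group in a file footer."""
    num_rows: int


@dataclass
class FileMeta:
    """Hypothetical stand-in for the footer of one data file."""
    row_groups: List[RowGroupMeta]


def count_star(files: List[FileMeta]) -> int:
    # Sum the recorded row counts; no column values are read or decoded.
    return sum(rg.num_rows for f in files for rg in f.row_groups)


# Illustrative table made of two files (assumed numbers).
table = [
    FileMeta([RowGroupMeta(1_000_000), RowGroupMeta(250_000)]),
    FileMeta([RowGroupMeta(750_000)]),
]
total = count_star(table)
```

This shortcut only answers an unfiltered `count(*)`. With a `WHERE` clause, or if the table format tracks deleted rows separately, the recorded counts alone would presumably not be enough.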

> One thing that might be worth clarifying is how the final merge occurs, I presume we would rebase the integration branch and then do a fast forward merge? I'm...

> a reasonable evolution of the process we've effectively adopted thus far

I like evolution. 1. Who does it? 2. When do they do it? 3. What's the naming convention?

1. Who: the first person who needs it? 2. When: when they need it? 3. Naming: PR to a non-master branch `v[next version here]`?