datafusion-ballista

Support Deltalake

Open avantgardnerio opened this issue 3 years ago • 3 comments

Which issue does this PR close?

Closes #456.

Described in issue

What changes are included in this PR?

  1. New delta feature that allows registration of delta tables

Are there any user-facing changes?

Users can register and query deltalake-formatted tables.

avantgardnerio avatar Oct 26 '22 23:10 avantgardnerio

:wave: I stumbled into this pull request and I'm curious what it takes to move this pull request forward.

Also, I'm not terribly familiar with Ballista configuration yet. How would things like AWS access keys be accessed or provided to the deltalake crate on the executors when they access Delta tables?

rtyler avatar Oct 19 '23 04:10 rtyler

> what it takes to move this pull request forward

It was working internally for my old company. It basically just "died on the vine", because it was waiting on a DeltaLake release I think. Everything should basically just work.

> AWS access keys be accessed or provided to the deltalake

The environment variables are defined in the object-store crate, and if you set them it magically just works.
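As a rough illustration of the point above, here is a minimal sketch of a startup check that a scheduler or executor process could run to confirm the credentials are present in its environment. The helper `missing_vars` and the constant `EXPECTED_AWS_VARS` are hypothetical names, not part of Ballista or the PR. The three variable names are ones the object-store crate's S3 builder reads from the environment, but which variables a given deployment actually needs is an assumption.

```rust
use std::env;

// Assumption: a subset of the AWS variables that the object-store crate reads
// from the environment. A real deployment may need others (for example a
// session token or a custom endpoint).
const EXPECTED_AWS_VARS: [&str; 3] = [
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "AWS_DEFAULT_REGION",
];

/// Hypothetical helper: returns every name in `required` for which `lookup`
/// yields no value or an empty value.
fn missing_vars<F>(required: &[&'static str], lookup: F) -> Vec<&'static str>
where
    F: Fn(&str) -> Option<String>,
{
    required
        .iter()
        .copied()
        .filter(|&name| lookup(name).map_or(true, |value| value.is_empty()))
        .collect()
}

fn main() {
    // Look the names up in the real process environment.
    let missing = missing_vars(&EXPECTED_AWS_VARS, |name| env::var(name).ok());
    if missing.is_empty() {
        println!("all expected AWS variables are set");
    } else {
        eprintln!("missing AWS variables: {}", missing.join(", "));
    }
}
```

Running the same check on the scheduler and on every executor is one way to catch a node that was started without the credentials.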

avantgardnerio avatar Oct 19 '23 15:10 avantgardnerio

@avantgardnerio Yeah, I figured as much. As a delta-rs maintainer I can assure you we're having more regular releases :smile:

From some experimentation last night, it looks like I have to make sure that the scheduler and all executors have the same credentials in their environments.

rtyler avatar Oct 19 '23 16:10 rtyler