[FEATURE REQUEST] Basic data (i.e. parquet file) writing support

Open scottsand-db opened this issue 3 years ago • 1 comments

Currently, Delta Standalone supports only metadata writing (i.e. commits to the _delta_log), metadata reading, and very basic data reading (i.e. reading Parquet files).
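
To make the metadata/data split concrete, here is a rough sketch of what a metadata-only commit amounts to on disk, assuming the Parquet file has already been written by something else. It does not use the Delta Standalone API. The helper names (`commit_file_name`, `add_action`, `write_commit`) are hypothetical, and the sketch omits the `protocol` and `metaData` actions that a table's first commit would also need.

```python
import json
import os
import tempfile


def commit_file_name(version: int) -> str:
    # Commits in _delta_log are named by a 20-digit zero-padded version.
    return f"{version:020d}.json"


def add_action(path: str, size: int, modification_time: int) -> dict:
    # An "add" action registers an already-written data file with the table.
    # The path is relative to the table root; the time is in epoch milliseconds.
    return {
        "add": {
            "path": path,
            "partitionValues": {},
            "size": size,
            "modificationTime": modification_time,
            "dataChange": True,
        }
    }


def write_commit(table_dir: str, version: int, actions: list) -> str:
    # Hypothetical helper: writes one action per line as JSON.
    log_dir = os.path.join(table_dir, "_delta_log")
    os.makedirs(log_dir, exist_ok=True)
    commit_path = os.path.join(log_dir, commit_file_name(version))
    # Mode "x" raises FileExistsError if this version was already committed.
    with open(commit_path, "x", encoding="utf-8") as f:
        for action in actions:
            f.write(json.dumps(action) + "\n")
    return commit_path


if __name__ == "__main__":
    table = tempfile.mkdtemp()
    action = add_action("part-00000.parquet", 1024, 1650643200000)
    print(write_commit(table, 1, [action]))
```

The data writing requested in this issue is the missing first half: producing `part-00000.parquet` itself, which today has to be done with a separate Parquet writer.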

There has been growing interest in, and a number of requests for, data writing support in Delta Standalone as well.

Let's use this issue as a place where users can give more details on the use cases, APIs, and interest in this feature.

scottsand-db, Apr 22 '22 16:04

Here's a rudimentary prototype I'm working on to deserialize Kafka to Delta - current iteration includes a barebones Java Parquet Writer that also commits to Delta: https://github.com/mdrakiburrahman/kafka-delta-ingest-adls/blob/main/src/main/java/com/microsoft/kdi/KDI.java

I think as it stands this should meet a Hello World type scenario for Delta Standalone newcomers. The repo above is already in a VSCode DevContainer so anyone can reproduce it.

mdrakiburrahman, Apr 23 '22 02:04

This repo has been deprecated, and the code has been moved under the connectors module in the https://github.com/delta-io/delta repository. Please create the issue in the https://github.com/delta-io/delta repository instead. See delta-io/connectors#556 for details.

vkorukanti, Jul 11 '23 17:07