Pulsar delta source connector (task1): Pulsar source connector framework and configuration
Motivation
Apache Pulsar is a cloud-native messaging and event-streaming platform. It acts as a bridge that connects different systems through messages.
The Pulsar delta source connector is a Pulsar IO connector for synchronizing data between Delta Lake and Pulsar. It captures data changes from Delta Lake through the Delta Standalone Reader (DSR) and writes them to Pulsar topics.
Subtask: #334
PR1: Basic framework, configuration fields, and the delta record definition.
PR2: Define deltaReader, which reads changes from Delta and returns a list of Parquet row records.
PR3: Implement DeltaReaderThread, which gets Parquet records from deltaReader and puts them into a blocking queue.
PR4: Implement the source connector checkpoint mechanism.
PR5: Add source connector metrics.
PR6: Add unit tests and integration tests.
PR7: Add docs.
This PR is the first PR for the Pulsar delta source connector. It contains only the base code framework and the basic configuration.
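For reviewers, the reader-thread and blocking-queue hand-off planned for PR3 could look roughly like the sketch below. It is not code from this PR: `ReaderQueueSketch`, `ChangeReader`, `startReader`, and `drain` are hypothetical names, rows are plain strings instead of Parquet records, and no Pulsar or Delta APIs are used. It only illustrates the idea of one thread reading table versions in order and blocking when the queue is full.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: every name here is illustrative, not the connector's actual classes.
class ReaderQueueSketch {

    // Stand-in for the planned deltaReader: returns the rows changed in one table version.
    interface ChangeReader {
        List<String> readChanges(long version);
    }

    // Stand-in for the planned DeltaReaderThread: reads versions in order and
    // puts each row on the queue, blocking while the queue is full.
    static Thread startReader(ChangeReader reader, BlockingQueue<String> queue,
                              long startVersion, long endVersion) {
        Thread t = new Thread(() -> {
            try {
                for (long v = startVersion; v <= endVersion; v++) {
                    for (String row : reader.readChanges(v)) {
                        queue.put(row); // blocks until the consumer makes room
                    }
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // keep the interrupt flag and stop
            }
        }, "delta-reader-sketch");
        t.setDaemon(true);
        t.start();
        return t;
    }

    // Consumer side, roughly what a source's read path would do: take rows one at a time.
    static List<String> drain(BlockingQueue<String> queue, int expected) {
        List<String> out = new ArrayList<>();
        try {
            while (out.size() < expected) {
                String row = queue.poll(5, TimeUnit.SECONDS);
                if (row == null) {
                    break; // timed out waiting for the reader
                }
                out.add(row);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return out;
    }

    public static void main(String[] args) {
        // A capacity of 2 forces the reader to wait for the consumer.
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(2);
        ChangeReader reader = v -> Arrays.asList("v" + v + "-row0", "v" + v + "-row1");
        startReader(reader, queue, 0, 2);
        System.out.println(drain(queue, 6));
    }
}
```

A bounded queue is used here so that a slow consumer applies back-pressure to the reader instead of letting records accumulate in memory. The actual queue type and capacity are left to the design doc and PR3.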
This is the design doc. https://docs.google.com/document/d/1J_SNaYW_2uxU3Y5H5klYipZq56prnFR_rIPc-hNIoq4/edit?usp=sharing
@dennyglee @scottsand-db Please help take a look, thanks a lot.
@scottsand-db @dennyglee Would you please help take a look if you have time?
This repo has been deprecated and the code has moved to the connectors module in the https://github.com/delta-io/delta repository. Here are the migration steps to recreate this PR in the new repository location.