[Feature][Connector-v2] Add NATS JetStream connector

Open rucciva opened this issue 1 month ago • 6 comments

Search before asking

  • [x] I had searched in the feature and found no similar feature requirement.

Description

Hello, please add sink and source connectors for NATS and NATS JetStream.

Usage Scenario

We are trying to implement the outbox pattern in our microservices and need to stream CDC from outbox tables in multiple databases into NATS. Since SeaTunnel has lean deployment requirements and already supports multiple Postgres CDC sources, we think it is the best choice for this scenario.

Related issues

No response

Are you willing to submit a PR?

  • [x] Yes I am willing to submit a PR!

Code of Conduct

rucciva avatar Nov 22 '25 04:11 rucciva

Hi @zhangshenghang, can you give some advice? I'm currently trying to decide whether this should support parallelism (with user-defined splits) or not.

NATS JetStream acts like Kafka, but it doesn't have partitioning. However, when we subscribe to a stream, we can manually specify which subjects to subscribe to. Say we have a stream stream1 that contains the subjects sub1, sub2.1, and sub2.2: we could subscribe to sub1 or sub2.* separately.
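To make the subject filtering concrete, here is a minimal sketch of NATS subject wildcard matching, where `*` matches exactly one dot-separated token and `>` matches one or more trailing tokens. The class `SubjectMatcher` and its `matches` method are hypothetical names written for illustration; they are not part of the NATS client or of SeaTunnel, and the sketch ignores edge cases such as empty tokens.

```java
import java.util.List;

/** Illustrative matcher for NATS subject wildcards; not part of any NATS client library. */
final class SubjectMatcher {

    /** Returns true if the dot-separated subject matches the subscription pattern. */
    static boolean matches(String pattern, String subject) {
        String[] p = pattern.split("\\.");
        String[] s = subject.split("\\.");
        for (int i = 0; i < p.length; i++) {
            if (p[i].equals(">")) {
                // '>' must be the last pattern token and needs at least one remaining subject token.
                return i == p.length - 1 && s.length > i;
            }
            if (i >= s.length) {
                return false; // subject has fewer tokens than the pattern
            }
            if (!p[i].equals("*") && !p[i].equals(s[i])) {
                return false; // literal token mismatch
            }
        }
        // Without a trailing '>', pattern and subject must have the same number of tokens.
        return p.length == s.length;
    }

    public static void main(String[] args) {
        // The subjects of the example stream stream1 from the comment above.
        for (String subject : List.of("sub1", "sub2.1", "sub2.2")) {
            System.out.println(subject + " matches sub2.*: " + matches("sub2.*", subject));
        }
    }
}
```

With the example stream, a subscription to sub2.* would cover sub2.1 and sub2.2 but not sub1, which is what makes a subject filter usable as a manual split.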

My hesitation comes from reading the Kafka connector, which separates

  1. fetching the information regarding the split
  2. the actual fetching of the data.

With manual splits in NATS, we don't need step 1, but in step 2 we might return nothing after waiting for some time on a given split.
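The concern about step 2 can be sketched as follows: a reader bound to one user-defined split waits for messages with a timeout and may hand back an empty batch when that split is idle. This is an in-memory illustration only; `SubjectSplitReader`, `deliver`, and `pollBatch` are hypothetical names, and a `BlockingQueue` stands in for the subscription. It does not use the NATS client or the SeaTunnel source reader API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

/** Hypothetical reader for one user-defined split (a subject filter); no NATS client involved. */
final class SubjectSplitReader {
    private final String subjectFilter;
    private final BlockingQueue<String> inbox = new LinkedBlockingQueue<>();

    SubjectSplitReader(String subjectFilter) {
        this.subjectFilter = subjectFilter;
    }

    String subjectFilter() {
        return subjectFilter;
    }

    /** Stands in for the server delivering a message on this split's subjects. */
    void deliver(String message) {
        inbox.add(message);
    }

    /** Waits up to timeoutMillis for a first message, then drains up to maxBatch; may return an empty list. */
    List<String> pollBatch(int maxBatch, long timeoutMillis) {
        List<String> batch = new ArrayList<>();
        try {
            String first = inbox.poll(timeoutMillis, TimeUnit.MILLISECONDS);
            if (first == null) {
                return batch; // idle split: nothing arrived within the timeout
            }
            batch.add(first);
            inbox.drainTo(batch, maxBatch - 1); // take whatever else is already queued
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
        }
        return batch;
    }

    public static void main(String[] args) {
        SubjectSplitReader busy = new SubjectSplitReader("sub1");
        SubjectSplitReader idle = new SubjectSplitReader("sub2.*");
        busy.deliver("order-created");
        System.out.println(busy.subjectFilter() + " -> " + busy.pollBatch(10, 100));
        System.out.println(idle.subjectFilter() + " -> " + idle.pollBatch(10, 100));
    }
}
```

In this model the idle split costs one full timeout per poll and produces nothing, which is the trade-off of fixed user-defined splits described above.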

What do you think?

rucciva avatar Nov 26 '25 10:11 rucciva

The NATS implementation should be simpler. NATS supports the QueueSubscribe method, so the work does not need to be divided by partitionNum as in Kafka. When parallelism is set, each parallel instance can consume all subjects, namely sub1, sub2.1, and sub2.2.

Reference: https://docs.nats.io/using-nats/developer/receiving/queues
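The queue-group behavior described in the linked documentation, where each message goes to only one member of the group, can be modeled in memory as below. This is a sketch of the delivery semantics only, assuming one worker per unit of parallelism: `QueueGroupModel` and `consume` are hypothetical names, a shared `BlockingQueue` stands in for the NATS server, and which worker receives a given message here depends on thread scheduling rather than on the server's choice.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

/** In-memory model of queue-group delivery; it does not talk to a NATS server. */
final class QueueGroupModel {

    /** Runs `parallelism` workers over one shared queue and returns the messages each worker handled. */
    static List<List<String>> consume(List<String> messages, int parallelism) {
        BlockingQueue<String> shared = new LinkedBlockingQueue<>(messages);
        List<List<String>> perWorker = new ArrayList<>();
        ExecutorService pool = Executors.newFixedThreadPool(parallelism);
        for (int i = 0; i < parallelism; i++) {
            List<String> handled = Collections.synchronizedList(new ArrayList<>());
            perWorker.add(handled);
            pool.submit(() -> {
                String msg;
                // poll() removes the head atomically, so no two workers receive the same message.
                while ((msg = shared.poll()) != null) {
                    handled.add(msg);
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
        }
        return perWorker;
    }

    public static void main(String[] args) {
        // Every worker is eligible for every subject; no per-subject split is assigned.
        List<String> messages = List.of("sub1:a", "sub2.1:b", "sub2.2:c", "sub1:d");
        List<List<String>> result = consume(messages, 2);
        for (int i = 0; i < result.size(); i++) {
            System.out.println("worker " + i + " handled " + result.get(i));
        }
    }
}
```

No split assignment is needed in this model: adding parallelism only means adding another member to the same group.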

zhangshenghang avatar Dec 04 '25 05:12 zhangshenghang

So we don't use any splitting rule, then. We just need to create as many subscribers as the parallelism defined in the config?

rucciva avatar Dec 04 '25 07:12 rucciva

Yes, I think it can be done this way. Do you have a better suggestion?

zhangshenghang avatar Dec 04 '25 13:12 zhangshenghang

Nope, I think it's simpler and better that way. Thanks a lot for the suggestion. I'll get back to it.

rucciva avatar Dec 04 '25 13:12 rucciva

Thank you for your contribution @rucciva

zhangshenghang avatar Dec 04 '25 14:12 zhangshenghang