camel-kafka-connector

Should processor configuration for source connectors be added?

Open ffang opened this issue 3 years ago • 11 comments

Source connectors basically start a consumer endpoint ("from uri"), but there is currently no way to add a processor class after the "from".

A processor is really flexible and can do pretty much anything to the Camel exchange before it is sent to Kafka, so I believe it would be good if we could specify a processor for source connectors.

ffang avatar Feb 19 '21 19:02 ffang

Opened a PR here: https://github.com/apache/camel-kafka-connector/pull/1045

ffang avatar Feb 19 '21 19:02 ffang

This is outside the roadmap. In camel-kafka-connector there is no need for processors. If a user needs a processor, they should use Camel directly. Camel is the engine.

oscerd avatar Feb 19 '21 19:02 oscerd

Hi @oscerd,

I may be missing something, but could you please elaborate on how to use Camel directly? If I read the CKC code correctly, the consumer endpoint (the "from") is created automatically when a source connector is initialised. In that case, how can I use a processor if I do want to add something for the consumer endpoint?

Thanks! Freeman

ffang avatar Feb 19 '21 19:02 ffang

Basically, the idea of CKC is to avoid exposing Camel details and to create a tiny abstraction layer between Kafka Connect and Camel. To use a processor, people need to know at least a bit of Camel, and I fear that adding too many Camel concepts to the project will turn it into just a mega Camel wrapper.

Btw let's discuss this with the community.

@valdar @orpiske etc. Please give your feedback :-)

oscerd avatar Feb 19 '21 19:02 oscerd

My understanding is in line w/ @oscerd's explanation.

Personally, I'd prefer the project to abstract away some Camel features and patterns in order to offer a simple "Kafka Connect way" for users to plug other systems into their connectors. Although we may need to leave certain features behind, such as processors, my belief is that solutions complex enough to require some of these features may be better done with Camel Core in the first place.

orpiske avatar Feb 19 '21 19:02 orpiske

If you need to manipulate the message while it moves from the source to Kafka, you should use an SMT or a converter (Kafka concepts); introducing processors does not really fit the Kafka Connect perspective. If SMTs and converters are not enough, CKC is probably not the best solution for the user. At that point they should use plain Camel, combining a consumer with the camel-kafka producer. We have aggregation and idempotent repository because these concepts are also used in the Kafka Connect world.

oscerd avatar Feb 20 '21 13:02 oscerd

I think it's a great discussion. Personally, what @oscerd and @orpiske explained really makes sense to me: CKC is the comprehensive collection of generic Kafka connectors abstracted away from the implementation by Camel. It's a great, focused strategy for the project. Here are some constructive suggestions from me from this perspective:

  1. At the same time, what @ffang expected from CKC shouldn't be uncommon among users, especially existing Camel users. We should clearly document somewhere this basic philosophy of keeping CKC a thin abstraction layer. The existing basic concepts document doesn't seem to address it well:
    https://camel.apache.org/camel-kafka-connector/latest/basic-concepts.html
  2. Also, we should address common questions from users, such as when CKC is not the best solution, when we should move away from CKC and use Camel Core directly, what my Camel application (architecture) would look like once we move away from CKC, etc.
  3. Talking about branding, perhaps Camel Kafka Connector is not the best name for the project, as it might give users the wrong expectation that they can use the full power of the Camel framework with it. Perhaps we should stick to the abbreviation, CKC, as the primary name for the project, implying something like the Common Kafka Connectors or the Collection of Kafka Connectors (like what OKD is trying to do). You could still use Powered by Apache Camel as the second punch-line or subtitle.

tadayosi avatar Feb 21 '21 05:02 tadayosi

I agree on 1. and 2., but I disagree a bit on 3., because the connectors behave exactly as the Camel components do and are generated directly from them. So Camel should be in the name, because basically we're still using the components, just in the Kafka Connect way.

There are some documentation issues about this and about use cases.

oscerd avatar Feb 21 '21 15:02 oscerd

Hello, my opinion on this is aligned with what @orpiske, @oscerd and, partially, @tadayosi explained so well.

The idea is to abstract away Camel details as much as possible. Moreover, I am introducing the use of Kamelets in CKC in a PoC, and additions to the CKC core make this transition harder. Ideally, if and when we move to Kamelets, those processors can be added in the Kamelet itself, just for the connectors that need them.

valdar avatar Feb 22 '21 11:02 valdar

I think we should close the related PR and close this discussion then.

oscerd avatar Feb 22 '21 13:02 oscerd

Thanks for all the great feedback! I'm digesting the input now and may ask more questions later on.

ffang avatar Feb 22 '21 18:02 ffang