camel-kafka-connector
Implicit unmarshalling in Cassandra Sink Connector Kamelet descriptor
The Cassandra Sink Connector Kamelet descriptor is configured by default with the Jackson unmarshalling step:
spec:
  ...
  template:
    from:
      uri: "kamelet:source"
      steps:
        - unmarshal:
            json:
              library: Jackson
              useList: true
In my opinion, the implicit unmarshalling is problematic because it does not fit the Kafka Connect data flow (converters and transforms): it requires additional serialization/deserialization steps for the Apache Camel pipeline.
In the case of the Cassandra connector specifically, the problem is more apparent because the JSON array format is not usable for the CQL statement builder. The problem with JSON is data-type ambiguity: for example, Cassandra might have an int64 column, but in JSON the value is just a number that can be deserialized to either int32 or int64, depending on its magnitude. There are further problems, for example how to pass a timestamp value in JSON reliably.
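The integer-width ambiguity can be sketched with plain Java. This is only an emulation of how a schema-less JSON parser typically picks the narrowest Java type for an integer literal (Jackson's default untyped mapping behaves this way as far as I know); it does not call Jackson itself, and the class and method names are hypothetical.

```java
import java.math.BigInteger;

public class JsonIntegerWidth {

    // Hypothetical helper: emulates the "narrowest type that fits" rule
    // a schema-less JSON parser applies to an integer literal.
    public static Number parseInteger(String literal) {
        BigInteger value = new BigInteger(literal);
        if (value.bitLength() <= 31) {
            return value.intValue();   // fits in int32 -> boxed as Integer
        }
        if (value.bitLength() <= 63) {
            return value.longValue();  // fits in int64 -> boxed as Long
        }
        return value;                  // larger values stay BigInteger
    }

    public static void main(String[] args) {
        // Two values destined for the same int64 column end up with
        // different Java types once they have passed through JSON.
        for (String literal : new String[] {"42", "2147483648"}) {
            Number n = parseInteger(literal);
            System.out.println(literal + " -> " + n.getClass().getSimpleName());
        }
    }
}
```

Running it prints `42 -> Integer` and `2147483648 -> Long`, so code that binds the value to an int64 column cannot rely on the Java type it receives.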
Normally, in a Kafka connector, you would use transformation plugins to cast the types into a specific format, or use the Struct data type instead of schema-less JSON. But because of the Jackson unmarshalling step, the Connect record value, even with the right types, must still be marshalled into JSON, which leads to the type ambiguity problem described above.
IMHO any implicit marshalling or unmarshalling steps should be stripped from the default Kamelet definitions for all connectors. Alternatively, there should be a way to remove this step through connector configuration.
The main problem is just that Kamelets are not exclusively for the Kafka Connect runtime.
I can think about changing it, but I don't think it's so easy.
It would be a breaking change indeed, so I understand the hesitation.
As an intermediate solution, I would at least appreciate an option (for example something like camel.sink.unmarshal: none) to remove the default unmarshalling step through configuration.