kafka-connect-hdfs
java.lang.RuntimeException: org.apache.kafka.connect.errors.SchemaProjectorException: Schema parameters not equal
We are using HDFS connector v4.0.1 and Kafka Connect 1.0.0.
Some data is already written to HDFS in Parquet format. The associated schema contains "redacted comment v1" in its doc field. We then evolved the schema, changing only the doc string to "redacted comment v2". Now, when new records arrive, we are facing the following error:
```
java.lang.RuntimeException: org.apache.kafka.connect.errors.SchemaProjectorException: Schema parameters not equal. source parameters: {connect.record.doc=redacted comment v2} and target parameters: {connect.record.doc=redacted comment v1}
    at io.confluent.connect.hdfs.TopicPartitionWriter.write(TopicPartitionWriter.java:401) ~[kafka-connect-hdfs-4.0.1.jar:?]
    at io.confluent.connect.hdfs.DataWriter.write(DataWriter.java:374) ~[kafka-connect-hdfs-4.0.1.jar:?]
```
I don't think changing the doc string should break any compatibility. The only workaround we found is to delete all existing data in HDFS, which is obviously not feasible in production. Any help or guidance on how to fix this would be highly appreciated.
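For context, the error comes from the schema projector comparing the two schemas' parameter maps for exact equality, so even a doc-only change trips it. A minimal self-contained sketch of that check (an assumption about the behavior for illustration, not the actual Connect code):

```java
import java.util.Map;

public class ParamCheck {
    // Simplified sketch of the strict parameter-equality comparison that
    // produces "Schema parameters not equal" (assumed behavior, not the
    // real SchemaProjector implementation).
    static void checkParams(Map<String, String> source, Map<String, String> target) {
        if (!source.equals(target)) {
            throw new RuntimeException("Schema parameters not equal. source parameters: "
                    + source + " and target parameters: " + target);
        }
    }

    public static void main(String[] args) {
        // The connect.record.doc parameter differs only in the doc text,
        // yet the whole map comparison fails.
        Map<String, String> v1 = Map.of("connect.record.doc", "redacted comment v1");
        Map<String, String> v2 = Map.of("connect.record.doc", "redacted comment v2");
        try {
            checkParams(v2, v1);
        } catch (RuntimeException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Because the doc string is carried as an ordinary map entry, a map-equality check has no way to treat it as "compatible but different".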
@rohitkug Can you try adding "connect.meta.data": "false"? This should strip all connect.record.* attributes from the schema parameters before projection.
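For reference, the suggested workaround would sit in the sink connector config roughly like this (a sketch; the connector name, topic, and other values are placeholders, not taken from the original setup):

```json
{
  "name": "hdfs-sink",
  "config": {
    "connector.class": "io.confluent.connect.hdfs.HdfsSinkConnector",
    "format.class": "io.confluent.connect.hdfs.parquet.ParquetFormat",
    "topics": "my-topic",
    "hdfs.url": "hdfs://namenode:8020",
    "flush.size": "1000",
    "connect.meta.data": "false"
  }
}
```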
Had the same issue. @cricket007 What if I need the metadata in the output schema?
I think it'll leave the regular doc fields and just ignore the ones I mentioned.
Hey 👋🏻 I'm also getting the same error
```
org.apache.kafka.connect.errors.SchemaProjectorException: Schema parameters not equal. source parameters: {connect.record.doc=doc v1} and target parameters: {connect.record.doc=doc v2}
```
with Confluent Kafka Connect v5.1.2, even though I have "connect.meta.data": "false" in my connector configuration, while using the Avro format with the S3 sink. As far as I can tell the issue is not resolved, and the line throwing the exception still exists in the code (needs validation).
Could you validate if the issue still exists? Is there any possible fix that you can think of?
Also, this appears to be a bug, since I'm observing different compatibility expectations between Confluent Schema Registry and Confluent Kafka Connect. Schema Registry allows registering a change to the doc fields of Avro records when compatibility is set to FULL, but Kafka Connect throws an exception. I've read that Schema Registry's behavior is the correct one. Is that deduction right?
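To illustrate the mismatch: a doc-only edit like the following (hypothetical schema, for illustration only) registers cleanly under FULL compatibility in Schema Registry, because Avro schema resolution ignores doc, yet the same pair of schemas triggers the exception above in Connect:

```json
{
  "type": "record",
  "name": "User",
  "doc": "doc v1",
  "fields": [
    {"name": "id", "type": "long", "doc": "doc v1"}
  ]
}
```

Changing only the two "doc v1" strings to "doc v2" leaves the reader/writer resolution rules untouched, so Schema Registry considers the versions fully compatible.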
(I'm writing in the HDFS repo even though I'm using the S3 sink, but if I'm not mistaken the issue has the same root in shared code; see the line mentioned above.)