kafka-connect-hdfs

java.lang.RuntimeException: org.apache.kafka.connect.errors.SchemaProjectorException: Schema parameters not equal

Open · rohitkug opened this issue 7 years ago · 4 comments

We are using HDFS connector v4.0.1 and Kafka Connect 1.0.0.

Some data has already been written to HDFS in Parquet format. The associated schema contains "redacted comment v1" in its doc field. We then evolved the schema, changing only the doc string to "redacted comment v2". Now, when receiving new records, we get the following error:

java.lang.RuntimeException: org.apache.kafka.connect.errors.SchemaProjectorException: Schema parameters not equal. source parameters: {connect.record.doc=redacted comment v2} and target parameters: {connect.record.doc=redacted comment v1}
    at io.confluent.connect.hdfs.TopicPartitionWriter.write(TopicPartitionWriter.java:401) ~[kafka-connect-hdfs-4.0.1.jar:?]
    at io.confluent.connect.hdfs.DataWriter.write(DataWriter.java:374) ~[kafka-connect-hdfs-4.0.1.jar:?]

I don't think changing the doc field should break compatibility. The only workaround is to remove all data from HDFS, which is obviously not feasible in production. Any help or guidance on how to fix this would be highly appreciated.
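For concreteness, the evolution is a doc-only change like the following (the record and field names here are illustrative, not our real schema); v1 and v2 are identical except for the doc string:

```json
{
  "type": "record",
  "name": "ExampleRecord",
  "doc": "redacted comment v2",
  "fields": [
    { "name": "id", "type": "long" }
  ]
}
```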

rohitkug · Nov 28 '18 08:11

@rohitkug Can you try adding "connect.meta.data": "false"? This will stop the connector from carrying the connect.record.* attributes in the schemas it builds.
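For a standalone HDFS sink, that property sits alongside the other connector properties; a rough sketch (the name, topic, and hdfs.url values below are placeholders):

```properties
# Sketch of an HDFS sink connector config with the suggested property.
# The name, topic, and hdfs.url values are placeholders.
name=hdfs-sink
connector.class=io.confluent.connect.hdfs.HdfsSinkConnector
topics=example-topic
hdfs.url=hdfs://namenode:8020
format.class=io.confluent.connect.hdfs.parquet.ParquetFormat
connect.meta.data=false
```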

OneCricketeer · Dec 03 '18 07:12

Had the same issue. @cricket007 What if I need the metadata in the output schema?

johnnycaol · Feb 11 '19 15:02

I think it will leave the regular doc fields alone and only drop the connect.record.* ones I mentioned.

OneCricketeer · Feb 11 '19 15:02

Hey 👋🏻 I'm also getting the same error

org.apache.kafka.connect.errors.SchemaProjectorException: Schema parameters not equal. source parameters: {connect.record.doc=doc v1} and target parameters: {connect.record.doc=doc v2}

with Confluent Kafka Connect v5.1.2, even though I have "connect.meta.data": "false" in my connector configuration, while using the Avro format with the S3 sink. I see that the issue is not resolved and the line causing the exception still exists in the code (needs validation).

Could you confirm whether the issue still exists? Is there any possible fix you can think of?

Also, it appears to me that this is a bug, since I'm observing different compatibility expectations between Confluent Schema Registry and Confluent Kafka Connect. Schema Registry allows registering a change to the doc fields of Avro records when compatibility is set to FULL, but Kafka Connect throws an exception. I've read that Schema Registry's behavior is the correct one. Is that deduction correct?
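To illustrate what I mean, here is a minimal sketch of the mismatch using Connect's public SchemaProjector API. The schema name and doc values are made up, and this paraphrases the behavior rather than quoting the connector's code: the Avro doc string is carried as the connect.record.doc schema parameter, and projection requires the parameters of the two schemas to be strictly equal.

```java
import org.apache.kafka.connect.data.Schema;
import org.apache.kafka.connect.data.SchemaBuilder;
import org.apache.kafka.connect.data.SchemaProjector;
import org.apache.kafka.connect.data.Struct;

public class DocChangeRepro {
    public static void main(String[] args) {
        // Two Connect schemas that differ only in the connect.record.doc
        // parameter, which is where the Avro doc string is carried
        // (the name and values here are illustrative).
        Schema v1 = SchemaBuilder.struct().name("ExampleRecord")
                .parameter("connect.record.doc", "doc v1")
                .field("id", Schema.INT64_SCHEMA)
                .build();
        Schema v2 = SchemaBuilder.struct().name("ExampleRecord")
                .parameter("connect.record.doc", "doc v2")
                .field("id", Schema.INT64_SCHEMA)
                .build();

        Struct record = new Struct(v2).put("id", 42L);

        // Throws SchemaProjectorException("Schema parameters not equal. ...")
        // even though the fields are identical: parameters are compared for
        // strict equality before field-level compatibility is considered.
        SchemaProjector.project(v2, record, v1);
    }
}
```

So a doc-only change that Avro FULL compatibility permits still fails projection on the Connect side.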

(I'm writing in the HDFS repo even though I'm using the S3 sink, but if I'm not mistaken the issue has the same root cause in code shared by both connectors; see the referenced line.)

mmmeeedddsss · Mar 02 '21 16:03