schema-registry-transfer-smt

Mirrormaker 2 updates

Open iturcino opened this issue 4 years ago • 9 comments

I'm working on a project where we need to implement data mirroring with a separate Schema Registry cluster per Kafka cluster, so I came across your great Confluent blog post and this repo. This is exactly what we need; we are using MirrorMaker 2 for this.

When I added this SMT to the MM2 connector (with CP v5.5.0, Kafka v2.5.0) it didn't work; I got the exception that is already documented in confluentinc/schema-registry#1222.

So I updated the dependencies, updated the code, and got it working on MirrorMaker 2.

I can make a PR for this, what do you think?

Also, right after this, I'll make the changes for #2, since that is also needed in my project; that's going to be another PR :)

iturcino avatar Jun 03 '20 13:06 iturcino

Overall, I think the solution is just removing the Jackson 1.x classes, which I believe is what the Avro + Registry upgrade did.

I'm not sold that the Kafka client version needs changing, but it doesn't hurt, I suppose.

OneCricketeer avatar Jun 04 '20 18:06 OneCricketeer

Thanks for the feedback.

Actually, this is the error I received while running the current version on CP 5.5.0, and it is why I made this update:

Task threw an uncaught and unrecoverable exception (org.apache.kafka.connect.runtime.WorkerTask:186)
java.lang.NoClassDefFoundError: org/codehaus/jackson/JsonParseException
        at cricket.jmoore.confluent.kafka.schemaregistry.client.CachedSchemaRegistryClient.getSchemaByIdFromRegistry(CachedSchemaRegistryClient.java:124)
        at cricket.jmoore.confluent.kafka.schemaregistry.client.CachedSchemaRegistryClient.getBySubjectAndId(CachedSchemaRegistryClient.java:190)
        at cricket.jmoore.confluent.kafka.schemaregistry.client.CachedSchemaRegistryClient.getById(CachedSchemaRegistryClient.java:169)

In recent versions, Schema Registry dropped the org.codehaus.jackson dependency, so I don't think removing Jackson 1.x (the com.fasterxml.jackson package) would solve this problem.

I guess I could include org.codehaus.jackson in the jar, but updating the dependencies seemed like the better solution...

iturcino avatar Jun 04 '20 20:06 iturcino

com.fasterxml.jackson was Jackson 2.x I believe. CodeHaus was Jackson 1.x, that is what I meant by that.

So, given the move to FasterXML, it makes sense that the class would not be found.

The alternative to your PR would be to keep a shaded CodeHaus (Jackson 1.x) along with Avro 1.8.x ... And I have been seeing that there are backwards compatibility issues around Avro 1.9.0 ... Not sure if those are resolved in 1.9.2 or not.
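For anyone hitting the same NoClassDefFoundError, a small classpath probe can show which Jackson generation is actually visible. This is only a sketch: `ClasspathCheck` and `isPresent` are hypothetical names made up for illustration, and the result depends on the classloader the code runs in (a Connect plugin's isolated classloader or a shaded jar with relocated packages may give a different answer than a plain `java` invocation).

```java
// Hypothetical helper, not part of this repo: reports whether a class can be loaded.
class ClasspathCheck {

    // Returns true if the named class is loadable from this class's classloader.
    // The class is located but not initialized (second argument is false).
    static boolean isPresent(String className) {
        try {
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String[] names = {
            // Jackson 1.x (CodeHaus), the class named in the stack trace above
            "org.codehaus.jackson.JsonParseException",
            // Jackson 2.x (FasterXML) counterpart
            "com.fasterxml.jackson.core.JsonParseException"
        };
        for (String name : names) {
            System.out.println(name + " -> " + (isPresent(name) ? "found" : "missing"));
        }
    }
}
```

Running it with the same classpath as the failing worker should make it clear whether the CodeHaus classes are missing, which is the situation described in this thread.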

OneCricketeer avatar Jun 04 '20 23:06 OneCricketeer

Yes, there were some backward compatibility issues between Avro 1.9.0 and 1.8.x, but they are fixed as of version 1.9.1: https://issues.apache.org/jira/browse/AVRO-2400 https://github.com/confluentinc/schema-registry/issues/1122

iturcino avatar Jun 07 '20 11:06 iturcino

Hi, like iturcino I was looking for exactly the same thing: using MM2 on Confluent Platform 5.5.0. Initially, after building the jar file from the master branch, I encountered the same java.lang.NoClassDefFoundError: org/codehaus/jackson/JsonParseException error. After building from the pr/26 branch it seems to be running fine. So thanks to both of you! I vote for merging this to master. One question though: I am testing this against a single Kafka cluster, so the source registry is the same as the target. After replicating topic 'avro_test' to 'src.avro_test' I noticed both topics refer to the same schema id = 1. I do see two subjects: ["src.test_avro-value","test_avro-value"]. Is this expected behaviour?

gerardq avatar Jun 19 '20 12:06 gerardq

Hi @gerardq, sorry, I've been busy with personal things, but to answer your question: yes, it's expected, because the schema text is run through an MD5 hash, which is used to check for uniqueness within a single registry. Same hash = same ID. That being said, I'm not sure I understand the use case of replicating within the same cluster when all you really need is a new consumer group on the same topic.
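The "same hash = same ID" behaviour can be illustrated with a toy model. This is a sketch based only on the explanation above, not the actual Schema Registry implementation: `ToyRegistry`, `register`, and `md5Hex` are hypothetical names, and the schema string is made up for the example.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy model (assumption): one registry that assigns IDs per unique schema text.
class ToyRegistry {
    private final Map<String, Integer> idByHash = new HashMap<>();
    private final Set<String> subjects = new TreeSet<>();
    private int nextId = 1;

    // Hex-encoded MD5 of the schema text.
    static String md5Hex(String text) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(text.getBytes(StandardCharsets.UTF_8));
            return String.format("%032x", new BigInteger(1, digest));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    // Records the subject and returns the ID for this schema text,
    // reusing the existing ID when the same text was registered before.
    int register(String subject, String schemaText) {
        subjects.add(subject);
        String hash = md5Hex(schemaText);
        Integer existing = idByHash.get(hash);
        if (existing != null) {
            return existing;
        }
        int id = nextId++;
        idByHash.put(hash, id);
        return id;
    }

    Set<String> subjects() {
        return subjects;
    }

    public static void main(String[] args) {
        ToyRegistry registry = new ToyRegistry();
        String schema = "{\"type\":\"record\",\"name\":\"Test\",\"fields\":[]}";
        int original = registry.register("test_avro-value", schema);
        int mirrored = registry.register("src.test_avro-value", schema);
        System.out.println(original + " " + mirrored + " " + registry.subjects());
    }
}
```

In this model the two subjects are tracked separately, but because the schema text is identical they resolve to one ID, matching what was observed when source and target are the same registry. With two separate registries, each would assign its own IDs independently.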

OneCricketeer avatar Jun 19 '20 13:06 OneCricketeer

Thanks for the quick answer. I was just testing in a single cluster for convenience. I'll do some testing with another cluster, but it is currently under construction.

gerardq avatar Jun 19 '20 13:06 gerardq

You may find my blog post helpful for that

https://www.confluent.io/blog/kafka-connect-tutorial-transfer-avro-schemas-across-schema-registry-clusters/

OneCricketeer avatar Jun 19 '20 13:06 OneCricketeer

Thanks, I found your blog first and then the code on GitHub.

gerardq avatar Jun 19 '20 14:06 gerardq