MirrorMaker migration documentation

Open OneCricketeer opened this issue 5 years ago • 7 comments

Regarding the Medium post

Mirus completely replaced Mirror Maker across all production data-centers at Salesforce in April 2018. Since then our data volumes have continued to grow.

For those who are running MirrorMaker with active consumer group offsets and would prefer not to get duplicates after starting Mirus, is there migration documentation available, or a runbook that Salesforce followed for the replacement?

OneCricketeer avatar Oct 08 '18 19:10 OneCricketeer

No documentation available yet, but I will put something together based on our experience at Salesforce.

pdavidson100 avatar Oct 16 '18 18:10 pdavidson100

+1 :-)

mtrienis avatar Nov 26 '18 05:11 mtrienis

@mtrienis Still on my todo list. The short version is that we shut down Mirror Maker, grabbed the Mirror Maker offsets using kafka-consumer-groups.sh, then used bin/mirus-offset-tool.sh with the --reset-offsets and --from-file flags to initialize the Mirus connector offsets. Then, when Mirus started, it was able to pick up where Mirror Maker left off with no duplicates.

For the first few clusters we actually left Mirror Maker running in parallel for a few minutes, and accepted the duplicates, just to guarantee everything was running as expected. We still used mirus-offset-tool.sh to initialize our offsets to avoid a flood of duplicates.
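A minimal sketch of the conversion step described above, in Python: it turns text captured from kafka-consumer-groups.sh --describe into lines in the JSON format shown later in this thread. The function name and the connector id are hypothetical, and the column names TOPIC, PARTITION and CURRENT-OFFSET are an assumption about the header your Kafka version prints, so check them against your own output first.

```python
import json

def consumer_group_to_mirus(describe_output, connector_id):
    """Convert `kafka-consumer-groups.sh --describe` text to Mirus offset lines.

    Assumes a whitespace-separated table whose header row contains the
    columns TOPIC, PARTITION and CURRENT-OFFSET (an assumption; verify it).
    """
    lines = [ln for ln in describe_output.splitlines() if ln.strip()]
    # Locate the header row by name so the column order does not matter.
    header_idx = next(
        (i for i, ln in enumerate(lines) if "CURRENT-OFFSET" in ln.split()),
        None,
    )
    if header_idx is None:
        raise ValueError("no header row containing CURRENT-OFFSET found")
    header = lines[header_idx].split()
    topic_col = header.index("TOPIC")
    part_col = header.index("PARTITION")
    offset_col = header.index("CURRENT-OFFSET")

    records = []
    for ln in lines[header_idx + 1:]:
        fields = ln.split()
        if len(fields) <= max(topic_col, part_col, offset_col):
            continue  # skip short or malformed rows
        partition, offset = fields[part_col], fields[offset_col]
        if not (partition.isdigit() and offset.isdigit()):
            continue  # e.g. "-" when no offset has been committed
        records.append(json.dumps(
            {
                "connectorId": connector_id,  # hypothetical connector name
                "topic": fields[topic_col],
                "partition": int(partition),
                "offset": int(offset),
            },
            separators=(",", ":"),
        ))
    return records
```

Writing the returned lines to a file, one per line, would give a candidate input for --from-file. Partitions with no committed offset are skipped rather than guessed.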

pdavidson100 avatar Nov 26 '18 17:11 pdavidson100

Idea:
Could MirusOffsetTool be extended to capture the offset listing functionality of ConsumerGroupCommand so that two scripts wouldn't be needed?

OneCricketeer avatar Jan 11 '19 19:01 OneCricketeer

@pdavidson100 @cricket007 Can you please share a sample file, or the format of the file, that we supply to MirusOffsetTool with the --from-file flag for resetting offsets? I get a "not a valid Long value" exception when I try to reset offsets.

Hari4AMQ avatar Jun 10 '19 14:06 Hari4AMQ

@Hari4AMQ The --from-file format is identical to the output format generated by --describe, and supports both CSV and JSON (recommended for setting offsets). For example, if you're setting the offsets of a 4-partition topic to 100, the file might look like this:

{"connectorId":"connector-id","topic":"topic-name","partition":0,"offset":100}
{"connectorId":"connector-id","topic":"topic-name","partition":1,"offset":100}
{"connectorId":"connector-id","topic":"topic-name","partition":2,"offset":100}
{"connectorId":"connector-id","topic":"topic-name","partition":3,"offset":100}
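A file in this format can be sanity-checked before it is passed to --from-file. The following Python sketch is not part of Mirus; the function name is hypothetical, and the set of required keys is simply taken from the example above. It flags lines that are not valid JSON or whose partition or offset is not an integer, which is one plausible (but unconfirmed) cause of a "not a valid Long value" error.

```python
import json

# Keys taken from the example lines above; treated as required here.
REQUIRED_KEYS = ("connectorId", "topic", "partition", "offset")

def check_offset_lines(lines):
    """Return (line_number, problem) tuples; an empty list means no problems found."""
    problems = []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # ignore blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append((number, f"not valid JSON: {exc.msg}"))
            continue
        if not isinstance(record, dict):
            problems.append((number, "not a JSON object"))
            continue
        for key in REQUIRED_KEYS:
            if key not in record:
                problems.append((number, f"missing key {key!r}"))
        for key in ("partition", "offset"):
            if key not in record:
                continue
            value = record[key]
            # bool is a subclass of int in Python, so exclude it explicitly.
            if isinstance(value, bool) or not isinstance(value, int):
                problems.append((number, f"{key} is not an integer: {value!r}"))
    return problems
```

For example, a line with "offset":"100" (a quoted string) is reported, while the four lines above pass.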

pdavidson100 avatar Jun 10 '19 16:06 pdavidson100

As @pdavidson100 mentioned, you should run the --describe option first, then edit the output file so that the partitions you care about have the offsets you need. This is the command I would use to get the offsets for topic t1:

bin/mirus-offset-tool.sh --properties-file config/<worker.properties> --describe --format json | grep "\"topic\":\"t1\"" > t1-offsets.json

Then edit t1-offsets.json to set the desired offsets.
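If there are many partitions, the edit can be scripted instead of done by hand. This is a rough Python sketch, assuming the --describe JSON output is one object per line as in the example earlier in the thread; the function name and the partition-to-offset mapping are illustrative only.

```python
import json

def set_offsets(lines, new_offsets):
    """Rewrite JSON offset lines.

    new_offsets maps a partition number to the desired offset; partitions
    that are not in the mapping keep their current offset.
    """
    updated = []
    for line in lines:
        if not line.strip():
            continue  # drop blank lines
        record = json.loads(line)
        if record["partition"] in new_offsets:
            record["offset"] = new_offsets[record["partition"]]
        # Compact separators match the single-line format shown above.
        updated.append(json.dumps(record, separators=(",", ":")))
    return updated
```

Reading t1-offsets.json, passing its lines through set_offsets, and writing the result back out produces a file that can then be supplied with --reset-offsets and --from-file.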

dalassi1 avatar Jun 10 '19 18:06 dalassi1