Replication fails

Open dhilgarth opened this issue 5 months ago • 0 comments

These are my args:

replicate --mode=scan --compare=QUICK --progress=LOG --log-file=System.out redis://10.0.20.232:6379 redis://seoApi-redis-primary:6379

This is the output:

[SimpleAsyncTaskExecutor-1] ERROR org.springframework.batch.core.step.AbstractStep - Encountered an error executing step replicate-replicate-RedisItemReader in job replicate-replicate-RedisItemReader
java.util.concurrent.ExecutionException: java.net.SocketException: Connection reset
	at java.base/java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:396)
	at java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2096)
	at com.redis.lettucemod.RedisModulesUtils.getAll(RedisModulesUtils.java:355)
	at com.redis.spring.batch.item.redis.common.OperationExecutor.execute(OperationExecutor.java:98)
	at com.redis.spring.batch.item.redis.common.OperationExecutor.process(OperationExecutor.java:83)
	at com.redis.spring.batch.item.redis.common.OperationExecutor.process(OperationExecutor.java:26)
	at com.redis.spring.batch.item.BlockingQueueItemWriter.write(BlockingQueueItemWriter.java:47)
	at org.springframework.batch.core.step.item.SimpleChunkProcessor.writeItems(SimpleChunkProcessor.java:203)
	at org.springframework.batch.core.step.item.SimpleChunkProcessor.doWrite(SimpleChunkProcessor.java:170)
	at org.springframework.batch.core.step.item.SimpleChunkProcessor.write(SimpleChunkProcessor.java:297)
	at org.springframework.batch.core.step.item.SimpleChunkProcessor.process(SimpleChunkProcessor.java:227)
	at org.springframework.batch.core.step.item.ChunkOrientedTasklet.execute(ChunkOrientedTasklet.java:75)
	at org.springframework.batch.core.step.tasklet.TaskletStep$ChunkTransactionCallback.doInTransaction(TaskletStep.java:388)
	at org.springframework.batch.core.step.tasklet.TaskletStep$ChunkTransactionCallback.doInTransaction(TaskletStep.java:312)
	at org.springframework.transaction.support.TransactionTemplate.execute(TransactionTemplate.java:140)
	at org.springframework.batch.core.step.tasklet.TaskletStep$2.doInChunkContext(TaskletStep.java:255)
	at org.springframework.batch.core.scope.context.StepContextRepeatCallback.doInIteration(StepContextRepeatCallback.java:82)
	at org.springframework.batch.repeat.support.RepeatTemplate.getNextResult(RepeatTemplate.java:369)
	at org.springframework.batch.repeat.support.RepeatTemplate.executeInternal(RepeatTemplate.java:206)
	at org.springframework.batch.repeat.support.RepeatTemplate.iterate(RepeatTemplate.java:140)
	at org.springframework.batch.core.step.tasklet.TaskletStep.doExecute(TaskletStep.java:240)
	at org.springframework.batch.core.step.AbstractStep.execute(AbstractStep.java:229)
	at org.springframework.batch.core.job.SimpleStepHandler.handleStep(SimpleStepHandler.java:153)
	at org.springframework.batch.core.job.AbstractJob.handleStep(AbstractJob.java:418)
	at org.springframework.batch.core.job.SimpleJob.doExecute(SimpleJob.java:132)
	at org.springframework.batch.core.job.AbstractJob.execute(AbstractJob.java:317)
	at org.springframework.batch.core.launch.support.SimpleJobLauncher$1.run(SimpleJobLauncher.java:157)
	at java.base/java.lang.Thread.run(Thread.java:1583)
Caused by: java.net.SocketException: Connection reset
	at java.base/sun.nio.ch.SocketChannelImpl.throwConnectionReset(SocketChannelImpl.java:401)
	at java.base/sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:434)
	at io.netty.buffer.PooledByteBuf.setBytes(PooledByteBuf.java:255)
	at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:1132)
	at io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:356)
	at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:151)
	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:788)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:724)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:650)
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:562)
	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:994)
	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
	... 1 more

Scanning   0% [                                   ]    0/1240 (0:00:00 / ?) ?/s
Scanning 100% [==========================] 1240/1240 (0:00:14 / 0:00:00) 88.6/s
Scanning 100% [==========================] 1240/1240 (0:00:14 / 0:00:00) 88.6/s

Comparing   0% [                                  ]    0/1240 (0:00:00 / ?) ?/s
Comparing  16% [=         ]  200/1240 (0:00:02 / 0:00:10) 100.0/s | missing 199
Comparing  48% [====      ]  600/1240 (0:00:03 / 0:00:03) 200.0/s | missing 599
Comparing  80% [========  ] 1000/1240 (0:00:04 / 0:00:00) 250.0/s | missing 999
Comparing 100% [=========] 1240/1240 (0:00:04 / 0:00:00) 310.0/s | missing 1239
Comparing 100% [=========] 1240/1240 (0:00:04 / 0:00:00) 310.0/s | missing 1239
Verification failed: missing 1239

I'm confident that network connectivity isn't the issue, since replication works for other Redis servers on the same networks. Also, the error is always the same, and the verification step always runs, which means the tool can reach both Redis servers.
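For anyone who wants to double-check reachability independently of riot, here is a minimal sketch in Python. It assumes plain TCP with no TLS and no AUTH, which may not match a given ElastiCache setup; `encode_command` and `ping` are hypothetical helper names, not part of riot or any Redis client library. It sends a raw RESP `PING` and returns the first reply line.

```python
import socket


def encode_command(*parts: str) -> bytes:
    """Encode a command as a RESP array of bulk strings."""
    chunks = [f"*{len(parts)}\r\n".encode()]
    for part in parts:
        data = part.encode()
        chunks.append(b"$" + str(len(data)).encode() + b"\r\n" + data + b"\r\n")
    return b"".join(chunks)


def ping(host: str, port: int = 6379, timeout: float = 5.0) -> str:
    """Open a TCP connection, send PING, and return the first reply line.

    Assumption: no TLS and no AUTH are required by the server.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(encode_command("PING"))
        reply = sock.recv(64)
    # A healthy, unauthenticated server answers with the simple string "+PONG".
    return reply.split(b"\r\n", 1)[0].decode()
```

Calling `ping("10.0.20.232")` and `ping("seoApi-redis-primary")` from the machine that runs riot would show whether each server answers at all. A successful `PING` only confirms basic reachability; it says nothing about whether writes to the target succeed.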

The source server is AWS ElastiCache Redis 7.1.0 and the target server is Redis 7.0.7. I've tried v4.1.0, v4.1.3 and early-access. I've also tried v3.2.3, which reports a RedisCommandTimeoutException after 3 minutes, even when I set the timeout to 10 minutes.

What's going on here? And more importantly: how can I fix it?

dhilgarth, Sep 17 '24 07:09