snowplow-rdb-loader
RDB Loader: fix serializable isolation violation errors
We have tens of occurrences of this error across different clients:
com.snowplowanalytics.snowplow.rdbloader.LoaderError$StorageTargetError: Database error: [Amazon](500310) Invalid operation: 1023
Details: Serializable isolation violation on table - 100167, transactions forming the cycle are: 55705701, 55705754 (pid:16606);
at com.snowplowanalytics.snowplow.rdbloader.dsl.JDBC$$anon$1.$anonfun$executeUpdate$1(JDBC.scala:165)
at raiseError @ retry.package$.retry$package$$retryingOnSomeErrorsImpl(package.scala:81)
at map @ com.snowplowanalytics.iglu.client.Client$.parseDefault(Client.scala:56)
at tailRecM @ retry.package$RetryingOnSomeErrorsPartiallyApplied.apply(package.scala:96)
at blockOn$extension @ doobie.free.KleisliInterpreter.$anonfun$primitive$1(kleisliinterpreter.scala:112)
at $anonfun$tailRecM$1 @ doobie.util.transactor$Transactor$$anon$4.$anonfun$apply$4(transactor.scala:167)
at tailRecM @ retry.package$RetryingOnSomeErrorsPartiallyApplied.apply(package.scala:96)
at tailRecM @ retry.package$RetryingOnSomeErrorsPartiallyApplied.apply(package.scala:96)
To me it looks like the Loader's COPY transaction races with another client's transaction. Or maybe our retry mechanism somehow runs a parallel transaction? If it's the former case, we can handle it with a plain failure handler, retrying the whole load a bit later.
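For illustration, here is a minimal sketch of what such a failure handler could look like, assuming cats-effect 3 and a hypothetical `load` action standing in for the whole COPY transaction; the exact error code exposed by the Amazon JDBC driver is an assumption, so the check falls back to matching the error message.

import scala.concurrent.duration._
import java.sql.SQLException
import cats.effect.IO

object RetryOnIsolationViolation {

  // Heuristic check for Redshift's serializable isolation violation (internal code 1023).
  // The vendor code reported via getErrorCode is not verified here, so we match the message.
  def isIsolationViolation(t: Throwable): Boolean = t match {
    case e: SQLException =>
      Option(e.getMessage).exists(_.contains("Serializable isolation violation"))
    case _ => false
  }

  // Retry the whole load a bit later (with exponential backoff) if it failed with an
  // isolation violation; any other error is re-raised immediately.
  // `load` is a hypothetical placeholder for the full loading transaction.
  def retryLoad[A](load: IO[A], attemptsLeft: Int, backoff: FiniteDuration): IO[A] =
    load.handleErrorWith {
      case e if isIsolationViolation(e) && attemptsLeft > 0 =>
        IO.sleep(backoff) *> retryLoad(load, attemptsLeft - 1, backoff * 2)
      case e =>
        IO.raiseError(e)
    }
}

This only restarts the loading step after the conflicting transaction has had a chance to finish; it does not try to resolve the conflict inside the same transaction.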
Found the same (or a very similar) issue in our QA pipeline, which doesn't have any client transactions. What's interesting is that there are a lot of these exceptions even though the pipeline has a very tiny volume.
Exception (most recent call first)
com.amazon.support.exceptions.ErrorException: [Amazon](500310) Invalid operation: 1023
Details: Serializable isolation violation on table - 100167, transactions forming the cycle are: 55705701, 55705754 (pid:16606);
at com.amazon.redshift.client.messages.inbound.ErrorResponse.toErrorException
at com.amazon.redshift.client.PGMessagingContext.handleErrorResponse
at com.amazon.redshift.client.PGMessagingContext.handleMessage
at com.amazon.jdbc.communications.InboundMessagesPipeline.getNextMessageOfClass
at com.amazon.redshift.client.PGMessagingContext.doMoveToNextClass
at com.amazon.redshift.client.PGMessagingContext.getErrorResponse
at com.amazon.redshift.client.PGClient.handleErrorsScenario2ForPrepareExecution
at com.amazon.redshift.client.PGClient.handleErrorsPrepareExecute
at com.amazon.redshift.dataengine.CallablePreparedOrAtomicExecuteTask.call
at com.amazon.redshift.dataengine.CallablePreparedOrAtomicExecuteTask.call
at java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
java.sql.SQLException: [Amazon](500310) Invalid operation: 1023
Details: Serializable isolation violation on table - 100167, transactions forming the cycle are: 55705701, 55705754 (pid:16606);
at com.amazon.redshift.client.messages.inbound.ErrorResponse.toErrorException
at com.amazon.redshift.client.PGMessagingContext.handleErrorResponse
at com.amazon.redshift.client.PGMessagingContext.handleMessage
at com.amazon.jdbc.communications.InboundMessagesPipeline.getNextMessageOfClass
at com.amazon.redshift.client.PGMessagingContext.doMoveToNextClass
at com.amazon.redshift.client.PGMessagingContext.getErrorResponse
at com.amazon.redshift.client.PGClient.handleErrorsScenario2ForPrepareExecution
at com.amazon.redshift.client.PGClient.handleErrorsPrepareExecute
at com.amazon.redshift.dataengine.CallablePreparedOrAtomicExecuteTask.call
at com.amazon.redshift.dataengine.CallablePreparedOrAtomicExecuteTask.call
at java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at blockOn$extension @ doobie.free.KleisliInterpreter.$anonfun$primitive$1(kleisliinterpreter.scala:112)
at $anonfun$tailRecM$1 @ doobie.util.transactor$Transactor$$anon$4.$anonfun$apply$4(transactor.scala:167)
at tailRecM @ retry.package$RetryingOnSomeErrorsPartiallyApplied.apply(package.scala:96)
at $anonfun$tailRecM$1 @ doobie.free.KleisliInterpreter$ConnectionInterpreter.$anonfun$bracketCase$28(kleisliinterpreter.scala:750)
at tailRecM @ retry.package$RetryingOnSomeErrorsPartiallyApplied.apply(package.scala:96)
at $anonfun$tailRecM$1 @ doobie.util.transactor$Transactor$$anon$4.$anonfun$apply$4(transactor.scala:167)
at tailRecM @ retry.package$RetryingOnSomeErrorsPartiallyApplied.apply(package.scala:96)
at $anonfun$tailRecM$1 @ doobie.free.KleisliInterpreter$ConnectionInterpreter.$anonfun$bracketCase$28(kleisliinterpreter.scala:750)
at tailRecM @ retry.package$RetryingOnSomeErrorsPartiallyApplied.apply(package.scala:96)
at $anonfun$tailRecM$1 @ doobie.util.transactor$Transactor$$anon$4.$anonfun$apply$4(transactor.scala:167)
at tailRecM @ retry.package$RetryingOnSomeErrorsPartiallyApplied.apply(package.scala:96)
at $anonfun$tailRecM$1 @ doobie.free.KleisliInterpreter$ConnectionInterpreter.$anonfun$bracketCase$28(kleisliinterpreter.scala:750)
at tailRecM @ retry.package$RetryingOnSomeErrorsPartiallyApplied.apply(package.scala:96)