azure-cosmosdb-spark

Read Data after enabling TTL on container level

Open · kiranvsk1 opened this issue 3 years ago · 5 comments

Hi Team,

I am trying to read data from a Cosmos DB (SQL API) container using the custom query option, and the reads fail with errors.

Setup of the container - TTL enabled with a default value of 1 week (7 * 24 * 60 * 60 = 604800 seconds)
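For reference, this is roughly how the container is configured (a minimal sketch using the azure-cosmos Python SDK; the endpoint, key, and names are placeholders, not the real ones):

```python
from azure.cosmos import CosmosClient, PartitionKey

# Placeholders -- substitute your own account endpoint and key.
client = CosmosClient(
    url="https://<account>.documents.azure.com:443/",
    credential="<account-key>",
)
database = client.get_database_client("<database>")

# Default TTL of one week: 7 * 24 * 60 * 60 = 604800 seconds.
container = database.create_container_if_not_exists(
    id="<container>",
    partition_key=PartitionKey(path="/id"),
    default_ttl=7 * 24 * 60 * 60,
)
```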

What works -

  1. Can write data to the container using the Azure Cosmos DB Spark connector
  2. Able to read stats for the container in the portal (count(1), etc.)

What does not work -

  1. Reads from the container using the Spark connector fail with a 500 error.

But if I remove the TTL setting on the container, I am able to read data using the Azure Cosmos DB Spark connector.
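For completeness, the failing read is essentially this (a minimal sketch; the endpoint, key, names, and query are placeholders):

```python
# Placeholders -- substitute your own account endpoint, key, and names.
cfg = {
    "spark.cosmos.accountEndpoint": "https://<account>.documents.azure.com:443/",
    "spark.cosmos.accountKey": "<account-key>",
    "spark.cosmos.database": "<database>",
    "spark.cosmos.container": "<container>",
    # The custom query read option of the Spark 3 OLTP connector.
    "spark.cosmos.read.customQuery": "SELECT * FROM c",
}

# Fails with a 500 while TTL is enabled; succeeds once TTL is removed.
df = spark.read.format("cosmos.oltp").options(**cfg).load()
df.show()
```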

Is this expected behavior with TTL turned on?

kiranvsk1 · Jul 19 '21

I am facing the same issue. It looks like the Cosmos OLTP connector does not work with TTL. I have hit it both with TTL set to On (no default) and with TTL on and a number of seconds set in the edit field.

sajins2005 · Jul 22 '21

Hi, can you tell us which version of the Spark connector you are using? It would also be great to see the error details (error message with call stack) of the failure.

Thanks, Fabian

FabianMeiswinkel · Jul 22 '21

@FabianMeiswinkel, I am using com.azure.cosmos.spark:azure-cosmos-spark_3-1_2-12:4.2.0 and faced the same issue with com.azure.cosmos.spark:azure-cosmos-spark_3-1_2-12:4.1.0 as well. The Azure Databricks runtime is 7.3 LTS (includes Apache Spark 3.0.1, Scala 2.12).
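For what it's worth, the connector is pulled in with exactly those Maven coordinates (a sketch for a plain Spark session; on Databricks the same coordinates are attached as a cluster library instead):

```python
from pyspark.sql import SparkSession

# Pin the connector coordinates mentioned above; on Databricks this is
# done by attaching the Maven coordinates as a cluster library.
spark = (
    SparkSession.builder
    .config(
        "spark.jars.packages",
        "com.azure.cosmos.spark:azure-cosmos-spark_3-1_2-12:4.2.0",
    )
    .getOrCreate()
)
```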

Stack trace org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 3, 10.139.64.4, executor 0): {"ClassName":"InternalServerErrorException","userAgent":"azsdk-java-cosmos/4.17.0-beta.1 Linux/5.4.0-1051-azure JRE/1.8.0_282","statusCode":500,"resourceAddress":"rntbd://cdb-ms-prod-eastus1-fd32.documents.azure.com:14054/apps/274509a2-d536-4a09-b0a3-f4fd526feb25/services/57940846-7939-4603-bc4e-e0297e4bd3b6/partitions/6c48721d-54ff-4bcb-8827-26bfca38bfe5/replicas/132707874448886293s/","error":"{"Errors":["An unknown error occurred while processing this request. If the issue persists, please contact Azure Support: http://aka.ms/azure-support"]}","innerErrorMessage":"["An unknown error occurred while processing this request. If the issue persists, please contact Azure Support: http://aka.ms/azure-support"]","causeInfo":null,"responseHeaders":"{x-ms-last-state-change-utc=Thu, 15 Jul 2021 01:51:45.529 GMT, x-ms-request-duration-ms=1.523, x-ms-session-token=0:-1#2302125, lsn=2302125, x-ms-request-charge=1.00, x-ms-schemaversion=1.12, x-ms-transport-request-id=4, x-ms-number-of-read-regions=0, x-ms-activity-id=dc18ad1d-eaef-11eb-ae6b-a915eade79fd, x-ms-xp-role=1, x-ms-global-Committed-lsn=2302124, x-ms-cosmos-llsn=2302125, x-ms-serviceversion= version=2.14.0.0}","cosmosDiagnostics":{"userAgent":"azsdk-java-cosmos/4.17.0-beta.1 Linux/5.4.0-1051-azure JRE/1.8.0_282","requestLatencyInMs":7,"requestStartTimeUTC":"2021-07-22T13:22:27.562Z","requestEndTimeUTC":"2021-07-22T13:22:27.569Z","responseStatisticsList":[{"storeResult":{"storePhysicalAddress":"rntbd://cdb-ms-prod-eastus1-fd32.documents.azure.com:14054/apps/274509a2-d536-4a09-b0a3-f4fd526feb25/services/57940846-7939-4603-bc4e-e0297e4bd3b6/partitions/6c48721d-54ff-4bcb-8827-26bfca38bfe5/replicas/132707874448886293s/","lsn":2302125,"globalCommittedLsn":2302124,"partitionKeyRangeId":"0","isValid":true,"statusCode":500,"subStatusCode":0,"isGone":false,"isNotFound":false,"isInvalidPartition":false,"isThroughputControlRequestRateTooLarge":false,"requestCharge":1.0,"itemLSN":-1,"sessionToken":"-1#2302125","backendLatencyInMs":1.523,"exception":"["An unknown error occurred while processing this request. 
If the issue persists, please contact Azure Support: http://aka.ms/azure-support"]","transportRequestTimeline":[{"eventName":"created","startTimeUTC":"2021-07-22T13:22:27.563Z","durationInMicroSec":0},{"eventName":"queued","startTimeUTC":"2021-07-22T13:22:27.563Z","durationInMicroSec":0},{"eventName":"channelAcquisitionStarted","startTimeUTC":"2021-07-22T13:22:27.563Z","durationInMicroSec":1000},{"eventName":"pipelined","startTimeUTC":"2021-07-22T13:22:27.564Z","durationInMicroSec":1000},{"eventName":"transitTime","startTimeUTC":"2021-07-22T13:22:27.565Z","durationInMicroSec":4000},{"eventName":"received","startTimeUTC":"2021-07-22T13:22:27.569Z","durationInMicroSec":0},{"eventName":"completed","startTimeUTC":"2021-07-22T13:22:27.569Z","durationInMicroSec":0}],"rntbdRequestLengthInBytes":498,"rntbdResponseLengthInBytes":326,"requestPayloadLengthInBytes":55,"responsePayloadLengthInBytes":null,"channelTaskQueueSize":1,"pendingRequestsCount":1,"serviceEndpointStatistics":{"availableChannels":1,"acquiredChannels":0,"executorTaskQueueSize":0,"inflightRequests":1,"lastSuccessfulRequestTime":"2021-07-22T13:22:26.781Z","lastRequestTime":"2021-07-22T13:22:27.411Z","createdTime":"2021-07-22T13:22:26.765Z","isClosed":false}},"requestResponseTimeUTC":"2021-07-22T13:22:27.569Z","requestResourceType":"Document","requestOperationType":"Query"}],"supplementalResponseStatisticsList":[],"addressResolutionStatistics":{},"regionsContacted":["[REDACTED]"],"retryContext":{"statusAndSubStatusCodes":null,"retryCount":0,"retryLatency":0},"metadataDiagnosticsContext":{"metadataDiagnosticList":null},"serializationDiagnosticsContext":{"serializationDiagnosticsList":null},"gatewayStatistics":null,"systemInformation":{"usedMemory":"202493 KB","availableMemory":"2670339 KB","systemCpuLoad":"empty","availableProcessors":4},"clientCfgs":{"id":0,"connectionMode":"DIRECT","numberOfClients":1,"connCfg":{"rntbd":"(cto:PT5S, rto:PT5S, icto:PT0S, ieto:PT1H, mcpe:130, mrpc:30, cer:false)","gw":"(cps:1000, rto:PT5S, icto:null, p:false)","other":"(ed: true, cs: false)"},"consistencyCfg":"(consistency: Eventual, mm: true, prgns: [])"}}} at azure_cosmos_spark.com.azure.cosmos.implementation.directconnectivity.rntbd.RntbdRequestManager.messageReceived(RntbdRequestManager.java:807) at azure_cosmos_spark.com.azure.cosmos.implementation.directconnectivity.rntbd.RntbdRequestManager.channelRead(RntbdRequestManager.java:181) at azure_cosmos_spark.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) at azure_cosmos_spark.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) at azure_cosmos_spark.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) at azure_cosmos_spark.io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:324) at azure_cosmos_spark.io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:296) at azure_cosmos_spark.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) at azure_cosmos_spark.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) at azure_cosmos_spark.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) at azure_cosmos_spark.io.netty.channel.CombinedChannelDuplexHandler$DelegatingChannelHandlerContext.fireChannelRead(CombinedChannelDuplexHandler.java:436) 
at azure_cosmos_spark.io.netty.channel.CombinedChannelDuplexHandler.channelRead(CombinedChannelDuplexHandler.java:253) at azure_cosmos_spark.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) at azure_cosmos_spark.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) at azure_cosmos_spark.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) at azure_cosmos_spark.io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:286) at azure_cosmos_spark.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) at azure_cosmos_spark.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) at azure_cosmos_spark.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) at azure_cosmos_spark.io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1368) at azure_cosmos_spark.io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1234) at azure_cosmos_spark.io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1280) at azure_cosmos_spark.io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:507) at azure_cosmos_spark.io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:446) at azure_cosmos_spark.io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:276) at azure_cosmos_spark.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) at azure_cosmos_spark.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) at azure_cosmos_spark.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) at azure_cosmos_spark.io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410) at azure_cosmos_spark.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) at azure_cosmos_spark.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) at azure_cosmos_spark.io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919) at azure_cosmos_spark.io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166) at azure_cosmos_spark.io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:719) at azure_cosmos_spark.io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:655) at azure_cosmos_spark.io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:581) at azure_cosmos_spark.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493) at azure_cosmos_spark.io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989) at azure_cosmos_spark.io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) at azure_cosmos_spark.io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) at java.lang.Thread.run(Thread.java:748)

Driver stacktrace: at org.apache.spark.scheduler.DAGScheduler.failJobAndIndependentStages(DAGScheduler.scala:2519) at org.apache.spark.scheduler.DAGScheduler.$anonfun$abortStage$2(DAGScheduler.scala:2466) at org.apache.spark.scheduler.DAGScheduler.$anonfun$abortStage$2$adapted(DAGScheduler.scala:2460) at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62) at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55) at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49) at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:2460) at org.apache.spark.scheduler.DAGScheduler.$anonfun$handleTaskSetFailed$1(DAGScheduler.scala:1152) at org.apache.spark.scheduler.DAGScheduler.$anonfun$handleTaskSetFailed$1$adapted(DAGScheduler.scala:1152) at scala.Option.foreach(Option.scala:407) at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:1152) at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:2721) at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2668) at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2656) at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:49) at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:938) at org.apache.spark.SparkContext.runJob(SparkContext.scala:2339) at org.apache.spark.sql.execution.collect.Collector.runSparkJobs(Collector.scala:298) at org.apache.spark.sql.execution.collect.Collector.collect(Collector.scala:308) at org.apache.spark.sql.execution.collect.Collector$.collect(Collector.scala:82) at org.apache.spark.sql.execution.collect.Collector$.collect(Collector.scala:88) at org.apache.spark.sql.execution.ResultCacheManager.getOrComputeResult(ResultCacheManager.scala:508) at org.apache.spark.sql.execution.CollectLimitExec.executeCollectResult(limit.scala:58) at org.apache.spark.sql.Dataset.collectResult(Dataset.scala:2994) at org.apache.spark.sql.Dataset.$anonfun$collectResult$1(Dataset.scala:2985) at org.apache.spark.sql.Dataset.$anonfun$withAction$1(Dataset.scala:3709) at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withCustomExecutionEnv$5(SQLExecution.scala:116) at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:249) at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withCustomExecutionEnv$1(SQLExecution.scala:101) at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:845) at org.apache.spark.sql.execution.SQLExecution$.withCustomExecutionEnv(SQLExecution.scala:77) at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:199) at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3707) at org.apache.spark.sql.Dataset.collectResult(Dataset.scala:2984) at com.databricks.backend.daemon.driver.OutputAggregator$.withOutputAggregation0(OutputAggregator.scala:194) at com.databricks.backend.daemon.driver.OutputAggregator$.withOutputAggregation(OutputAggregator.scala:57) at com.databricks.backend.daemon.driver.PythonDriverLocal.generateTableResult(PythonDriverLocal.scala:1157) at com.databricks.backend.daemon.driver.PythonDriverLocal.$anonfun$getResultBufferInternal$1(PythonDriverLocal.scala:1069) at com.databricks.backend.daemon.driver.PythonDriverLocal.withInterpLock(PythonDriverLocal.scala:856) at com.databricks.backend.daemon.driver.PythonDriverLocal.getResultBufferInternal(PythonDriverLocal.scala:938) at 
com.databricks.backend.daemon.driver.DriverLocal.getResultBuffer(DriverLocal.scala:538) at com.databricks.backend.daemon.driver.PythonDriverLocal.outputSuccess(PythonDriverLocal.scala:898) at com.databricks.backend.daemon.driver.PythonDriverLocal.$anonfun$repl$8(PythonDriverLocal.scala:383) at com.databricks.backend.daemon.driver.PythonDriverLocal.withInterpLock(PythonDriverLocal.scala:856) at com.databricks.backend.daemon.driver.PythonDriverLocal.repl(PythonDriverLocal.scala:370) at com.databricks.backend.daemon.driver.DriverLocal.$anonfun$execute$10(DriverLocal.scala:431) at com.databricks.logging.UsageLogging.$anonfun$withAttributionContext$1(UsageLogging.scala:239) at scala.util.DynamicVariable.withValue(DynamicVariable.scala:62) at com.databricks.logging.UsageLogging.withAttributionContext(UsageLogging.scala:234) at com.databricks.logging.UsageLogging.withAttributionContext$(UsageLogging.scala:231) at com.databricks.backend.daemon.driver.DriverLocal.withAttributionContext(DriverLocal.scala:48) at com.databricks.logging.UsageLogging.withAttributionTags(UsageLogging.scala:276) at com.databricks.logging.UsageLogging.withAttributionTags$(UsageLogging.scala:269) at com.databricks.backend.daemon.driver.DriverLocal.withAttributionTags(DriverLocal.scala:48) at com.databricks.backend.daemon.driver.DriverLocal.execute(DriverLocal.scala:408) at com.databricks.backend.daemon.driver.DriverWrapper.$anonfun$tryExecutingCommand$1(DriverWrapper.scala:653) at scala.util.Try$.apply(Try.scala:213) at com.databricks.backend.daemon.driver.DriverWrapper.tryExecutingCommand(DriverWrapper.scala:645) at com.databricks.backend.daemon.driver.DriverWrapper.getCommandOutputAndError(DriverWrapper.scala:486) at com.databricks.backend.daemon.driver.DriverWrapper.executeCommand(DriverWrapper.scala:598) at com.databricks.backend.daemon.driver.DriverWrapper.runInnerLoop(DriverWrapper.scala:391) at com.databricks.backend.daemon.driver.DriverWrapper.runInner(DriverWrapper.scala:337) at com.databricks.backend.daemon.driver.DriverWrapper.run(DriverWrapper.scala:219) at java.lang.Thread.run(Thread.java:748)

sajins2005 · Jul 22 '21

Hi, I'm facing exactly the same error here.

I am using:

Databricks Runtime Version: 8.3 (Apache Spark 3.1.1, Scala 2.12) and/or 8.4 (Apache Spark 3.1.2, Scala 2.12)
Cosmos DB Spark Connector: com.azure.cosmos.spark:azure-cosmos-spark_3-1_2-12:4.2.0

Stack trace attached: stacktrace.txt

Thanks, SR

samuelramos · Jul 27 '21

@FabianMeiswinkel
Is there any update on this issue?

Regards, Sajin

sajins2005 · Aug 12 '21