
Cassandra timeout during read query at consistency ALL

Open serragnoli opened this issue 4 years ago • 0 comments

  • Phantom is the only Cassandra driver used in this project

  • The connector code:

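import com.datastax.driver.core.{Cluster, HostDistance, PoolingOptions, SocketOptions}
import com.outworkers.phantom.connectors.{CassandraConnection, ContactPoint}

// `cassandra` is the application's own configuration object (not shown here).
object PhantomConnector {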

  private val username: String   = cassandra.username
  private val password: String   = cassandra.password
  private val keyspace: String   = cassandra.keyspace
  private val hosts: Seq[String] = cassandra.hosts
  private val port: Int          = cassandra.port.value

  private val databasePooling: PoolingOptions = new PoolingOptions()
    .setCoreConnectionsPerHost(HostDistance.REMOTE, 2)
    .setMaxConnectionsPerHost(HostDistance.REMOTE, 4)
    .setMaxRequestsPerConnection(HostDistance.REMOTE, 2000)
    .setMaxQueueSize(16192)
    .setPoolTimeoutMillis(120000)

  private val cluster: Cluster.Builder = new Cluster.Builder()
    .addContactPoints(hosts: _*)
    .withPort(port)
    .withCredentials(username, password)
    .withPoolingOptions(databasePooling)
    .withoutJMXReporting()
    .withoutMetrics()
    .withSocketOptions(
      new SocketOptions()
        .setReadTimeoutMillis(900000)
        .setConnectTimeoutMillis(900000)
    )

  val connection: CassandraConnection = ContactPoint(port)
    .noHeartbeat()
    .withClusterBuilder(_ => cluster)
    .keySpace(keyspace)
}
  • The consistency level explicitly set on all read queries is LOCAL_QUORUM:
    select.where(_.territory eqs territory)
      .and(_.source eqs source)
      .and(_.lookupKey eqs lookupKey)
      .and(_.fragmentType eqs fragmentType)
      .consistencyLevel_=(ConsistencyLevel.LOCAL_QUORUM)
      .fetch()
  • There are sporadic errors that report consistency ALL, as below:
2020-08-19T18:00:01.739 [cel-api-akka.actor.default-dispatcher-8060] INFO  com.myapp.cel.api.Main$ - Transforming for GB at 1597860001739
2020-08-19T18:00:02.159 [scala-execution-context-global-19974] INFO  c.myapp.cel.api.service.MapperService$ - Classification fetched [62280]
[ERROR] [08/19/2020 18:00:13.574] [cel-api-akka.actor.default-dispatcher-8059] [akka.actor.ActorSystemImpl(cel-api)] Error during processing of request: 'Cassandra timeout during read query at consistency ALL (6 responses were required but only 5 replica responded)'. Completing with 500 Internal Server Error response. To change default exception handling behavior, provide a custom ExceptionHandler.
com.datastax.driver.core.exceptions.ReadTimeoutException: Cassandra timeout during read query at consistency ALL (6 responses were required but only 5 replica responded)
	at com.datastax.driver.core.exceptions.ReadTimeoutException.copy(ReadTimeoutException.java:124)
	at com.datastax.driver.core.Responses$Error.asException(Responses.java:169)
	at com.datastax.driver.core.RequestHandler$SpeculativeExecution.onSet(RequestHandler.java:646)
	at com.datastax.driver.core.Connection$Dispatcher.channelRead0(Connection.java:1233)
	at com.datastax.driver.core.Connection$Dispatcher.channelRead0(Connection.java:1151)
	at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
	at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:287)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
	at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
	at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:310)
	at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:297)
	at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:413)
	at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:265)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
	at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1334)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
	at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:926)
	at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:134)
	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:644)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:579)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:496)
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:458)
	at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
	at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
	at java.lang.Thread.run(Thread.java:748)
Caused by: com.datastax.driver.core.exceptions.ReadTimeoutException: Cassandra timeout during read query at consistency ALL (6 responses were required but only 5 replica responded)
	at com.datastax.driver.core.Responses$Error$1.decode(Responses.java:91)
	at com.datastax.driver.core.Responses$Error$1.decode(Responses.java:66)
	at com.datastax.driver.core.Message$ProtocolDecoder.decode(Message.java:297)
	at com.datastax.driver.core.Message$ProtocolDecoder.decode(Message.java:268)
	at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:88)
	... 22 more
  • The Cassandra cluster topology is 2 DCs of 7 nodes each, with a replication factor of 3
  • Where could Phantom be picking up consistency ALL, given that all reads are explicitly set to LOCAL_QUORUM?
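
For reference, the "6 responses were required" figure in the error lines up with the topology if the replication factor of 3 applies in each of the two DCs. The sketch below only works through that arithmetic; the DC names and the `ConsistencyMath` object are hypothetical, and the per-DC replication factor is an assumption, not something taken from the keyspace definition.

```scala
object ConsistencyMath {
  // Assumption: RF 3 in each of the two data centres (DC names are made up).
  val replicationByDc: Map[String, Int] = Map("dc1" -> 3, "dc2" -> 3)

  // Quorum for a given replica count: floor(rf / 2) + 1.
  def quorum(rf: Int): Int = rf / 2 + 1

  // LOCAL_QUORUM needs a quorum of the replicas in the coordinator's DC only.
  def localQuorum(localDc: String): Int = quorum(replicationByDc(localDc))

  // ALL needs every replica in every DC to respond.
  def all: Int = replicationByDc.values.sum
}
```

Under these assumptions a LOCAL_QUORUM read would wait for 2 responses, while a read at ALL would wait for 6, which matches the count in the exception and suggests the failing read spans both DCs.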

serragnoli, Aug 20 '20 13:08