Conseil
Conseil copied to clipboard
Conseil API: sudden "request timeout"
As advised in #932 , I built Conseil from master ([commit-hash: 9f4eb4c]
).
Since this change I see configuration-related warnings like:
06:58:42.422 [akka.actor.default-dispatcher-48] WARN t.c.c.a.metadata.UnitTransformation - There're missing metadata overrides for "tezos / carthagenet / blocks / fork_id"
06:58:42.425 [akka.actor.default-dispatcher-48] WARN t.c.c.a.metadata.UnitTransformation - There're missing metadata overrides for "tezos / carthagenet / blocks / invalidated_asof"
And also sudden "blackouts" where the API stops answering, i.e. no response, no 503, nothing to a GET /v2/data/tezos/carthagenet/blocks/head
call.
Logs around the blackout:
...
13:09:54.004 [akka.actor.default-dispatcher-44] INFO t.c.c.a.d.RecordingDirectives - HTTP request
13:09:54.178 [akka.actor.default-dispatcher-44] INFO t.c.c.a.d.RecordingDirectives - HTTP request
13:09:54.179 [akka.actor.default-dispatcher-23] INFO t.c.c.a.d.RecordingDirectives - HTTP request
13:09:54.185 [akka.actor.default-dispatcher-34] INFO t.c.c.a.d.RecordingDirectives - HTTP request
13:10:00.802 [akka.actor.default-dispatcher-25] WARN t.c.c.a.metadata.UnitTransformation - There're missing metadata overrides for "tezos / carthagenet / operations / invalidated_asof"
13:10:00.802 [akka.actor.default-dispatcher-25] WARN t.c.c.a.metadata.UnitTransformation - There're missing metadata overrides for "tezos / carthagenet / operations / fork_id"
13:10:01.125 [akka.actor.default-dispatcher-43] INFO t.c.c.a.d.RecordingDirectives - HTTP request
[INFO] [10/23/2020 13:15:03.812] [conseil-system-akka.actor.default-dispatcher-7418] [akka.actor.ActorSystemImpl(conseil-system)] Request timeout encountered for request [POST /v2/data/tezos/carthagenet/operations Strict(370 bytes)]
[INFO] [10/23/2020 13:15:04.063] [conseil-system-akka.actor.default-dispatcher-7418] [akka.actor.ActorSystemImpl(conseil-system)] Request timeout encountered for request [GET /v2/data/tezos/carthagenet/blocks/head Strict(0 bytes)]
[INFO] [10/23/2020 13:15:04.243] [conseil-system-akka.actor.default-dispatcher-7418] [akka.actor.ActorSystemImpl(conseil-system)] Request timeout encountered for request [GET /v2/data/tezos/carthagenet/blocks/head Strict(0 bytes)]
...
It has happened several times since the update and the pattern is always the same:
HTTP request # Healthy as far as I understand
Request timeout to .../operations # Broken
Request timeout to .../blocks/head # Broken
(I have a process that checks the blocks/head periodically.) I get the metadata config warnings when I restart Conseil and sometimes before the blackout, too - but not always. I am not sure if the two problems are related.
If I restart the API it works again for some time.
What is the CPU utilization of the Tezos node when you experience these Conseil issues? What about CPU utilization overall?
Hello @vishakh , When it is in the "zombie" state it looks like this:
mpstat
12:35:17 PM CPU %usr %nice %sys %iowait %irq %soft %steal %guest %gnice %idle
12:35:17 PM all 1.38 0.00 0.68 0.79 0.00 0.03 0.09 0.00 0.00 97.03
ps -p 2014,7823 -o %cpu,%mem,cmd
%CPU %MEM CMD
19.5 2.5 tezos-node run -v --history-mode=archive --data-dir=tezos --network=carthagenet --rpc-addr=0.0.0.0:8732 --config-file=carthagenet.json --connections=5
3.9 15.4 java -Dconfig.file=/app/conseil/conf/api-carthagenet.conf -jar conseil-api.jar
When I restart conseil-api it gets busier - but only for a minute or two.
OK, I did some more testing. The missing metadata overrides
seems to be unrelated.
I found that this problem occurs if I make very busy calls to /operations and /balance_updates.
What I find really interesting is that when Conseil API freezes then tezos-node (running on the same machine) freezes, too. Do you know why it could happen?
Maybe the Postgres container is being hit hard by the Conseil queries. This reduces the available IO for the Tezos node.
Can you provide us with the queries you are running so we can profile them?
Hi, here are the calls:
{"fields":[],"predicates":[{"field":"operation_group_hash","operation":"in","set":["********************"],"inverse":false},{"field":"kind","operation":"in","set":["reveal","transaction","origination","delegation"],"inverse":false}],"orderBy":[],"aggregation":[]}
2020-11-04 16:25:10,694 TRACE [ Test worker] [client.conseil.TezosConseilAPI] --> END POST (294-byte body)
2020-11-04 16:25:10,741 TRACE [ Test worker] [client.conseil.TezosConseilAPI] <-- 200 OK http://***********************/v2/data/tezos/carthagenet/operations (46ms)
{"fields":[],"predicates":[{"field":"operation_group_hash","operation":"in","set":["**********************"],"inverse":false}],"orderBy":[],"aggregation":[]}
2020-11-04 16:25:10,782 TRACE [ Test worker] [client.conseil.TezosConseilAPI] --> END POST (186-byte body)
2020-11-04 16:25:10,808 TRACE [ Test worker] [client.conseil.TezosConseilAPI] <-- 200 OK http://***********************/v2/data/tezos/carthagenet/balance_updates (25ms)
@g574 In both cases the query is optimized since it hits a database index. Now I wonder whether your Postgres container or server is overwhelmed. Could you show us the CPU and memory utilization of the Postgres process when the problem queries are running?
@vishakh sorry, I found a bug in my systems, the tezos-node is not affected, only Conseil API. Is the Postgres CPU/memory utilization still relevant?
Can you clarify what you mean by finding a bug?
On Fri, Nov 6, 2020 at 7:49 AM g574 [email protected] wrote:
@vishakh https://github.com/vishakh sorry, I found a bug in my systems, the tezos-node is not affected, only Conseil API. Is the Postgres CPU/memory utilization still relevant?
— You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub https://github.com/Cryptonomic/Conseil/issues/941#issuecomment-723063654, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAHDKW7IIWEBGWVMXGXWJNDSOPWEVANCNFSM4S4TJEOQ .
The way I ran queries against tezos-node was wrong so I believed it stopped responding at the same time as Conseil.
But you are still having issues with Conseil, yes? If yes, please check the Postgres performance when you are having Conseil problems.
On Fri, Nov 6, 2020 at 7:54 AM g574 [email protected] wrote:
The way I ran queries against tezos-node was wrong so I believed it stopped responding at the same time as Conseil.
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/Cryptonomic/Conseil/issues/941#issuecomment-723065724, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAHDKW6YWMTT5TXWZRJD7S3SOPWX3ANCNFSM4S4TJEOQ .
mpstat
:
03:57:54 PM CPU %usr %nice %sys %iowait %irq %soft %steal %guest %gnice %idle
03:57:54 PM all 1.44 0.00 0.69 0.51 0.00 0.03 0.10 0.00 0.00 97.24
ps -eo %cpu,%mem,cmd | grep postgres
:
0.0 0.6 /usr/bin/postgres -D ...
0.0 0.0 postgres: logger process
0.0 22.8 postgres: checkpointer process
0.0 0.5 postgres: writer process
0.0 0.2 postgres: wal writer process
0.0 0.0 postgres: autovacuum launcher process
0.0 0.0 postgres: stats collector process
0.0 0.0 postgres: bgworker: logical replication launcher
2.4 3.8 postgres: ... ... ...(42560) idle
1.4 3.7 postgres: ... ... ...(42566) idle
0.0 0.6 postgres: ... ... ...(42568) idle
0.0 0.0 postgres: ... ... ...(42570) idle
0.0 0.0 postgres: ... ... ...(42588) idle
0.0 0.0 postgres: ... ... ...(42592) idle
0.0 0.0 postgres: ... ... ...(42596) idle
I don't find it too heavy. Can it be related to the high number of "idle" processes?
Very puzzling. So none of the processes seems to be under load while you experience Conseil issues? The host overall is not under any special load?
BTW, would using the Conseil node on https://nautilus.cloud/ work for you?
On Fri, Nov 6, 2020 at 10:08 AM g574 [email protected] wrote:
mpstat:
03:57:54 PM CPU %usr %nice %sys %iowait %irq %soft %steal %guest %gnice %idle 03:57:54 PM all 1.44 0.00 0.69 0.51 0.00 0.03 0.10 0.00 0.00 97.24
ps -eo %cpu,%mem,cmd | grep postgres:
0.0 0.6 /usr/bin/postgres -D ... 0.0 0.0 postgres: logger process 0.0 22.8 postgres: checkpointer process 0.0 0.5 postgres: writer process 0.0 0.2 postgres: wal writer process 0.0 0.0 postgres: autovacuum launcher process 0.0 0.0 postgres: stats collector process 0.0 0.0 postgres: bgworker: logical replication launcher 2.4 3.8 postgres: ... ... ...(42560) idle 1.4 3.7 postgres: ... ... ...(42566) idle 0.0 0.6 postgres: ... ... ...(42568) idle 0.0 0.0 postgres: ... ... ...(42570) idle 0.0 0.0 postgres: ... ... ...(42588) idle 0.0 0.0 postgres: ... ... ...(42592) idle 0.0 0.0 postgres: ... ... ...(42596) idle
I don't find it too heavy. Can it be related to the high number of "idle" processes?
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/Cryptonomic/Conseil/issues/941#issuecomment-723129947, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAHDKWZANQPUAIGZBBAGVNDSOQGO7ANCNFSM4S4TJEOQ .
Hi @vishakh , sorry for the delay. I found that the tezos-node makes some extreme disk utilization and I/O. Among other problems it may also be the cause of these blackouts. Is it possible to do anything about it on the Conseil end? Does lowering the number of PostgreSQL threads make any sense?
Thanks for the question but unfortunately nautilus.cloud is not an option for me right now.
It sounds like you are just maxing out your IO on the host. The simplest solution is to simply scale up your host. Is that an option for you?
On Thu, Nov 12, 2020 at 3:46 AM g574 [email protected] wrote:
Hi @vishakh https://github.com/vishakh , sorry for the delay. I found that the tezos-node makes some extreme disk utilization and I/O. Among other problems it may also be the cause of these blackouts. Is it possible to do anything about it on the Conseil end? Does lowering the number of PostgreSQL threads make any sense?
Thanks for the question but unfortunately nautilus.cloud is not an option for me right now.
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/Cryptonomic/Conseil/issues/941#issuecomment-725933769, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAHDKW4TROLIEC7HVTXLV43SPOOHDANCNFSM4S4TJEOQ .
Hi @vishakh ,
I managed to solve the I/O problem but I still experience these blackouts.
I started Conseil API with the -verbose
switch and caught this log:
08:07:46.829 [akka.actor.default-dispatcher-30] INFO t.c.c.a.d.RecordingDirectives - HTTP request
08:07:47.825 [-akka.actor.default-dispatcher-3] INFO t.c.c.a.d.RecordingDirectives - HTTP request
08:07:56.888 [akka.actor.default-dispatcher-27] INFO t.c.c.a.d.RecordingDirectives - HTTP request
08:07:57.886 [akka.actor.default-dispatcher-27] INFO t.c.c.a.d.RecordingDirectives - HTTP request
[1894.088s][info][class,load ] akka.http.impl.engine.parsing.HttpHeaderParser$RawHeaderValueParser source: file:/app/conseil/bin/conseil-api.jar
08:07:59.275 [akka.actor.default-dispatcher-27] INFO t.c.c.a.d.RecordingDirectives - HTTP request
[1894.184s][info][class,load ] akka.http.javadsl.model.HttpEntity$Default source: file:/app/conseil/bin/conseil-api.jar
[1894.185s][info][class,load ] akka.http.scaladsl.model.HttpEntity$Default source: file:/app/conseil/bin/conseil-api.jar
[1894.185s][info][class,load ] java.lang.invoke.LambdaForm$DMH/0x000000084058d840 source: java.lang.invoke.LambdaForm
[1894.186s][info][class,load ] akka.http.impl.engine.parsing.HttpMessageParser$$Lambda$2609/0x00000008406f0040 source: akka.http.impl.engine.parsing.HttpMessageParser
[1894.186s][info][class,load ] java.lang.invoke.LambdaForm$DMH/0x000000084058a840 source: java.lang.invoke.LambdaForm
[1894.187s][info][class,load ] java.lang.invoke.LambdaForm$DMH/0x0000000840589c40 source: java.lang.invoke.LambdaForm
[1894.187s][info][class,load ] java.lang.invoke.LambdaForm$MH/0x0000000840588040 source: java.lang.invoke.LambdaForm
[1894.188s][info][class,load ] java.lang.invoke.LambdaForm$DMH/0x0000000840579c40 source: java.lang.invoke.LambdaForm
[1894.188s][info][class,load ] akka.http.impl.engine.parsing.HttpMessageParser$$Lambda$2610/0x00000008406cb840 source: akka.http.impl.engine.parsing.HttpMessageParser
[1894.189s][info][class,load ] java.lang.invoke.LambdaForm$DMH/0x0000000840560840 source: java.lang.invoke.LambdaForm
[1894.189s][info][class,load ] java.lang.invoke.LambdaForm$DMH/0x00000008404f2c40 source: java.lang.invoke.LambdaForm
[1894.189s][info][class,load ] java.lang.invoke.LambdaForm$MH/0x00000008404f3040 source: java.lang.invoke.LambdaForm
[1894.190s][info][class,load ] akka.http.impl.engine.parsing.HttpMessageParser$$Lambda$2611/0x0000000840589040 source: akka.http.impl.engine.parsing.HttpMessageParser
[1894.193s][info][class,load ] akka.stream.stage.GraphStageLogic$SubSourceOutlet source: file:/app/conseil/bin/conseil-api.jar
[1894.194s][info][class,load ] akka.stream.impl.fusing.SubSink$Command source: file:/app/conseil/bin/conseil-api.jar
[1894.195s][info][class,load ] akka.stream.stage.GraphStageLogic$SubSourceOutlet$$Lambda$2612/0x0000000840564840 source: akka.stream.stage.GraphStageLogic$SubSourceOutlet
[1894.196s][info][class,load ] akka.stream.impl.fusing.SubSource source: file:/app/conseil/bin/conseil-api.jar
[1894.198s][info][class,load ] akka.stream.impl.SubscriptionTimeoutException source: file:/app/conseil/bin/conseil-api.jar
[1894.198s][info][class,load ] akka.stream.impl.fusing.SubSource$$anon$13 source: file:/app/conseil/bin/conseil-api.jar
[1894.200s][info][class,load ] akka.http.impl.engine.server.HttpServerBluePrint$PrepareRequests$$anon$1$$anon$2 source: file:/app/conseil/bin/conseil-api.jar
[1894.202s][info][class,load ] akka.http.impl.engine.parsing.HttpMessageParser$$anonfun$1 source: file:/app/conseil/bin/conseil-api.jar
[1894.204s][info][class,load ] akka.http.scaladsl.model.EntityStreamException source: file:/app/conseil/bin/conseil-api.jar
[1894.206s][info][class,load ] akka.http.scaladsl.model.HttpEntity$Default$$Lambda$2613/0x0000000840374040 source: akka.http.scaladsl.model.HttpEntity$Default
[1894.207s][info][class,load ] akka.http.scaladsl.model.HttpEntity$Limitable$ source: file:/app/conseil/bin/conseil-api.jar
[1894.208s][info][class,load ] akka.http.scaladsl.model.HttpEntity$SizeLimit source: file:/app/conseil/bin/conseil-api.jar
[1894.208s][info][class,load ] akka.http.scaladsl.model.HttpEntity$Limitable$$$Lambda$2614/0x0000000840564040 source: akka.http.scaladsl.model.HttpEntity$Limitable$
[1894.209s][info][class,load ] akka.http.scaladsl.model.HttpEntity$Limitable source: file:/app/conseil/bin/conseil-api.jar
[1894.209s][info][class,load ] akka.http.scaladsl.model.HttpEntity$Limitable$$anon$1 source: file:/app/conseil/bin/conseil-api.jar
[1894.212s][info][class,load ] akka.http.impl.util.StreamUtils$$anon$3 source: file:/app/conseil/bin/conseil-api.jar
[1894.212s][info][class,load ] akka.http.impl.util.StreamUtils$$anon$3$$anon$4 source: file:/app/conseil/bin/conseil-api.jar
[1894.215s][info][class,load ] akka.http.impl.util.ToStrict source: file:/app/conseil/bin/conseil-api.jar
[1894.216s][info][class,load ] akka.http.impl.util.ToStrict$$anon$1 source: file:/app/conseil/bin/conseil-api.jar
[1894.217s][info][class,load ] akka.stream.impl.HeadOptionStage source: file:/app/conseil/bin/conseil-api.jar
[1894.218s][info][class,load ] akka.stream.SinkShape$ source: file:/app/conseil/bin/conseil-api.jar
[1894.219s][info][class,load ] akka.stream.scaladsl.Sink$$$Lambda$2615/0x0000000840563840 source: akka.stream.scaladsl.Sink$
[1894.220s][info][class,load ] akka.stream.impl.HeadOptionStage$$anon$2 source: file:/app/conseil/bin/conseil-api.jar
[1894.221s][info][class,load ] akka.stream.scaladsl.Sink$$$Lambda$2616/0x0000000840563040 source: akka.stream.scaladsl.Sink$
[1894.222s][info][class,load ] akka.http.impl.util.ToStrict$$anon$1$$anon$2 source: file:/app/conseil/bin/conseil-api.jar
[1894.223s][info][class,load ] akka.http.impl.util.ToStrict$$anon$1$$anon$3 source: file:/app/conseil/bin/conseil-api.jar
[1894.225s][info][class,load ] akka.stream.stage.TimerGraphStageLogic$$anon$2 source: file:/app/conseil/bin/conseil-api.jar
[1894.225s][info][class,load ] akka.http.scaladsl.server.directives.ExecutionDirectives$$Lambda$2617/0x0000000840560040 source: akka.http.scaladsl.server.directives.ExecutionDirectives
[1894.226s][info][class,load ] tech.cryptonomic.conseil.api.directives.RecordingDirectives$$anonfun$loggingExceptionHandler$1$$Lambda$2618/0x0000000840556040 source: tech.cryptonomic.conseil.api.directives.RecordingDirectives$$anonfun$loggingExceptionHandler$1
[1894.226s][info][class,load ] akka.stream.Attributes$$anonfun$getFirst$1 source: file:/app/conseil/bin/conseil-api.jar
[1894.227s][info][class,load ] tech.cryptonomic.conseil.api.directives.RecordingDirectives$$anonfun$loggingExceptionHandler$1$$Lambda$2619/0x0000000840556840 source: tech.cryptonomic.conseil.api.directives.RecordingDirectives$$anonfun$loggingExceptionHandler$1
[1894.227s][info][class,load ] tech.cryptonomic.conseil.api.directives.RecordingDirectives$$anonfun$loggingExceptionHandler$1$$Lambda$2620/0x0000000840bf1840 source: tech.cryptonomic.conseil.api.directives.RecordingDirectives$$anonfun$loggingExceptionHandler$1
[1894.228s][info][class,load ] akka.stream.actor.ActorSubscriberMessage source: file:/app/conseil/bin/conseil-api.jar
[1894.228s][info][class,load ] akka.stream.impl.fusing.SubSource$$anon$13$$Lambda$2621/0x0000000840557040 source: akka.stream.impl.fusing.SubSource$$anon$13
[1894.228s][info][class,load ] akka.http.scaladsl.marshalling.Marshaller$$$Lambda$2622/0x0000000840557840 source: akka.http.scaladsl.marshalling.Marshaller$
[1894.229s][info][class,load ] akka.stream.impl.fusing.SubSink$RequestOne$ source: file:/app/conseil/bin/conseil-api.jar
[1894.229s][info][class,load ] akka.http.scaladsl.marshalling.Marshaller$$$Lambda$2623/0x0000000840beb440 source: akka.http.scaladsl.marshalling.Marshaller$
[1894.230s][info][class,load ] akka.http.scaladsl.marshalling.Marshalling$WithOpenCharset source: file:/app/conseil/bin/conseil-api.jar
[1894.231s][info][class,load ] akka.http.scaladsl.marshalling.Marshal$$anonfun$$nestedInanonfun$toResponseFor$3$1 source: file:/app/conseil/bin/conseil-api.jar
[1894.233s][info][class,load ] akka.http.impl.engine.rendering.ResponseRenderingContext$CloseRequested$ForceClose$ source: file:/app/conseil/bin/conseil-api.jar
[1894.235s][info][class,load ] akka.stream.actor.ActorSubscriberMessage$OnComplete$ source: file:/app/conseil/bin/conseil-api.jar
[1894.235s][info][class,load ] akka.stream.scaladsl.Sink$$$Lambda$2624/0x0000000840be9040 source: akka.stream.scaladsl.Sink$
[2194.264s][info][class,load ] akka.http.impl.util.package$RichHttpRequest$ source: file:/app/conseil/bin/conseil-api.jar
[2194.267s][info][class,load ] akka.event.Logging$Info$ source: file:/app/conseil/bin/conseil-api.jar
[2194.268s][info][class,load ] akka.event.Logging$Info2 source: file:/app/conseil/bin/conseil-api.jar
[2194.271s][info][class,load ] akka.event.Logging$LogEventWithMarker source: file:/app/conseil/bin/conseil-api.jar
[2194.271s][info][class,load ] akka.event.Logging$Info3 source: file:/app/conseil/bin/conseil-api.jar
[2194.272s][info][class,load ] java.time.format.DateTimePrintContext source: shared objects file
[2194.272s][info][class,load ] java.time.LocalDate$1 source: shared objects file
[2194.272s][info][class,load ] java.time.format.DateTimeFormatterBuilder$3 source: shared objects file
[2194.279s][info][class,load ] scala.io.AnsiColor source: file:/app/conseil/bin/conseil-api.jar
[2194.279s][info][class,load ] scala.DeprecatedConsole source: file:/app/conseil/bin/conseil-api.jar
[2194.279s][info][class,load ] scala.Console$ source: file:/app/conseil/bin/conseil-api.jar
[2194.317s][info][class,load ] sun.nio.cs.US_ASCII$Decoder source: jrt:/java.base
[INFO] [11/23/2020 08:12:59.446] [conseil-system-akka.actor.default-dispatcher-927] [akka.actor.ActorSystemImpl(conseil-system)] Request timeout encountered for request [POST /v2/data/tezos/carthagenet/operations Strict(434 bytes)]
[INFO] [11/23/2020 08:13:06.962] [conseil-system-akka.actor.default-dispatcher-927] [akka.actor.ActorSystemImpl(conseil-system)] Request timeout encountered for request [GET /v2/data/tezos/carthagenet/blocks/head Strict(0 bytes)]
[INFO] [11/23/2020 08:13:07.951] [conseil-system-akka.actor.default-dispatcher-927] [akka.actor.ActorSystemImpl(conseil-system)] Request timeout encountered for request [GET /v2/data/tezos/carthagenet/blocks/head Strict(0 bytes)]
[INFO] [11/23/2020 08:13:19.012] [conseil-system-akka.actor.default-dispatcher-929] [akka.actor.ActorSystemImpl(conseil-system)] Request timeout encountered for request [GET /v2/data/tezos/carthagenet/blocks/head Strict(0 bytes)]
My current config looks like this:
include "metadata.conf"
platforms: [ {
name: tezos
network: carthagenet
enabled: true
node: {
protocol: "http"
hostname: "***********"
port: 8732
pathPrefix: ""
}
}
]
conseil {
hostname: "0.0.0.0"
port: *******
cache-ttl: 15 minutes
max-query-result-size: 100000
high-cardinality-limit: 100
startup-deadline: 5 minutes
security.api-keys {
keys: []
allow-blank: true
}
db {
dataSourceClass: "org.postgresql.ds.PGSimpleDataSource"
properties {
user: "************"
password: "**************"
url: "jdbc:postgresql://***************/****"
reWriteBatchedInserts: true
}
}
}
akka {
http {
dispatcher {
type: "Dispatcher"
executor: "thread-pool-executor"
throughput: 1
thread-pool-executor {
fixed-pool-size: 16
}
}
server {
request-timeout: 5 minutes
idle-timeout: 5 minutes
}
}
}
Does it look correct?
hi @vishakh , I suppose there is some misconfiguration. Do you see any red flag in my config?
@ivanopagano @piotrkosecki Could you please review @g574's configuration above?
I think you could try changing this cache-ttl: 15 minutes
to cache-ttl: 24 hours
. Cache is not needed to be refreshed so frequently - the current value was left there for development purposes.
Other lines look fine for me.
@g574 Please try what @piotrkosecki mentioned and let us know if you still have issues.
Thank you. I have updated the configuration. I will observe my Conseil and let you know the result.
Hello @piotrkosecki , @vishakh Unfortunately I still get those blackouts. My current config is as follows:
include "metadata.conf"
platforms {
tezos {
carthagenet {
node {
protocol: "http",
hostname: "***********"
port: 8732
pathPrefix: ""
}
}
}
}
conseil {
hostname: "0.0.0.0"
port: *****
cache-ttl: 24 hours
max-query-result-size: 100000
high-cardinality-limit: 100
startup-deadline: 5 minutes
security.apiKeys {
keys: []
allow-blank: true
}
db {
dataSourceClass: "org.postgresql.ds.PGSimpleDataSource"
properties {
user: "**********"
password: "**************"
url: "jdbc:postgresql://************"
reWriteBatchedInserts: true
}
}
}
akka {
http {
dispatcher {
type: "Dispatcher"
executor: "thread-pool-executor"
throughput: 1
thread-pool-executor {
fixed-pool-size: 16
}
}
server {
request-timeout: 5 minutes
idle-timeout: 5 minutes
}
}
}
It would be OK to have a timeout now and then but I wonder why it does not recover without a restart.
I have switched to delphinet and the problem persists. I have not experienced it on mainnet yet.
A new release with enhanced logging is almost ready.
On Wed, Jan 13, 2021 at 5:10 AM g574 [email protected] wrote:
I have switched to delphinet and the problem persists. I have not experienced it on mainnet yet.
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/Cryptonomic/Conseil/issues/941#issuecomment-759346642, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAHDKW5NPMLRUHHR3QKHUDDSZVWRRANCNFSM4S4TJEOQ .
Hi, the problem persists on delphinet - currently on 2021-january-release-35:
[INFO] [02/24/2021 10:59:19.907] [conseil-system-akka.actor.default-dispatcher-573] [akka.actor.ActorSystemImpl(conseil-system)] Request timeout encountered for request [GET /v2/data/tezos/delphinet/blocks/head Strict(0 bytes)]
[INFO] [02/24/2021 10:59:19.288] [conseil-system-akka.actor.default-dispatcher-572] [akka.actor.ActorSystemImpl(conseil-system)] Request timeout encountered for request [GET /v2/data/tezos/delphinet/blocks/head Strict(0 bytes)]
[INFO] [02/24/2021 10:59:19.327] [conseil-system-akka.actor.default-dispatcher-573] [akka.actor.ActorSystemImpl(conseil-system)] Request timeout encountered for request [GET /v2/data/tezos/delphinet/blocks/head Strict(0 bytes)]
[INFO] [02/24/2021 10:59:19.907] [conseil-system-akka.actor.default-dispatcher-573] [akka.actor.ActorSystemImpl(conseil-system)] Request timeout encountered for request [GET /v2/data/tezos/delphinet/blocks/head Strict(0 bytes)]
If I try time curl --noproxy '*' http://<address>/v2/data/tezos/delphinet/blocks/head
in hangs forever. Once I restart the api I get a response again after 0.7 seconds.
Iotat, free, top don't show any high load. The network traffic is less than 100 kB per sec.
Can you describe the hardware you are running on? What are the load averages when this issue happens?