nats-server icon indicating copy to clipboard operation
nats-server copied to clipboard

Internal subscription on `$JS.API.STREAM.MSG.GET.KV_RoleElection` took too long [v2.10.20]

Open Inkathu opened this issue 10 months ago • 2 comments

Hi! We have the configuration with about 40 streams and about 400 consumers. Server version: 2.10.20 Using KV storage for some kind of role election mechanism based on key ttl. Besides the election normal message exchange is in processing. At some point in time, we see timeout exceptions from the client side and as a result problems with the election mechanism(losing the elected role). This happens at 1 node from 40+. This one node is some kind of prime node with a dedicated role(more loaded with messages).

We use one shared connection for each node. And there are my questions:

  1. how to catch the bottleneck? Nats server metrics at the low level - no CPU overload etc.
  2. Are there any recommendations to use connections per some characteristics of the system? (If we have 2-3k messages per second in some part of the system)
  3. After nats cache clean(one of the streams has about 6kk messages) problem go away for a few days. How does cleaning affect this?
  4. How to interpret the following server logs(it's just 1 second!) and what should be done to avoid this kind of messages?

Jan 10, 2025 21:36:41.598 Delaying PING due to remote client data or ping 20s Jan 10, 2025 21:36:41.598 Delaying PING due to remote client data or ping 20s Jan 10, 2025 21:36:41.598 Delaying PING due to remote client data or ping 20s Jan 10, 2025 21:36:41.598 Delaying PING due to remote client data or ping 21s Jan 10, 2025 21:36:41.598 Delaying PING due to remote client data or ping 20s Jan 10, 2025 21:36:41.598 Delaying PING due to remote client data or ping 20s Jan 10, 2025 21:36:41.598 Delaying PING due to remote client data or ping 20s Jan 10, 2025 21:36:41.598 Delaying PING due to remote client data or ping 20s Jan 10, 2025 21:36:41.601 Internal subscription on ""$JS.API.STREAM.MSG.GET.KV_RoleElection"" took too long: 20.198593644s" Jan 10, 2025 21:36:41.601 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 20.198619564s Jan 10, 2025 21:36:41.601 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 20.19866817s Jan 10, 2025 21:36:41.601 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 20.409116672s Jan 10, 2025 21:36:41.601 Readloop processing time: 20.198681749s Jan 10, 2025 21:36:41.601 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 20.409129336s Jan 10, 2025 21:36:41.601 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 20.409149796s Jan 10, 2025 21:36:41.601 Readloop processing time: 20.409167457s Jan 10, 2025 21:36:41.601 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 20.142541821s Jan 10, 2025 21:36:41.601 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 20.142559025s Jan 10, 2025 21:36:41.601 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 20.142583592s Jan 10, 2025 21:36:41.601 Readloop processing time: 20.142594292s Jan 10, 2025 21:36:41.601 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 20.129221177s Jan 10, 2025 21:36:41.601 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 20.294897855s Jan 10, 2025 21:36:41.601 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 20.294911264s Jan 10, 2025 21:36:41.601 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 20.1290045s Jan 10, 2025 21:36:41.601 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 20.129012828s Jan 10, 2025 21:36:41.601 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 20.129026419s Jan 10, 2025 21:36:41.601 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 20.25366148s Jan 10, 2025 21:36:41.601 Readloop processing time: 20.129034178s Jan 10, 2025 21:36:41.601 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 20.252105422s Jan 10, 2025 21:36:41.601 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 20.252112439s Jan 10, 2025 21:36:41.601 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 20.252129791s Jan 10, 2025 21:36:41.601 Readloop processing time: 20.252138714s Jan 10, 2025 21:36:41.601 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 20.128372288s Jan 10, 2025 21:36:41.601 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 20.128379142s Jan 10, 2025 21:36:41.601 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 20.128390356s Jan 10, 2025 21:36:41.601 Readloop processing time: 20.128397783s Jan 10, 2025 21:36:41.601 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 20.423745249s Jan 10, 2025 21:36:41.601 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 20.423756743s Jan 10, 2025 21:36:41.601 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 20.423784983s Jan 10, 2025 21:36:41.601 Readloop processing time: 20.423791753s Jan 10, 2025 21:36:41.601 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 20.183092118s Jan 10, 2025 21:36:41.601 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 20.183700068s Jan 10, 2025 21:36:41.601 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 20.183726718s Jan 10, 2025 21:36:41.601 Readloop processing time: 20.183737928s Jan 10, 2025 21:36:41.601 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 20.121914777s Jan 10, 2025 21:36:41.601 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 20.121924278s Jan 10, 2025 21:36:41.601 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 20.294947886s Jan 10, 2025 21:36:41.601 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 20.118178308s Jan 10, 2025 21:36:41.601 Readloop processing time: 20.295158857s Jan 10, 2025 21:36:41.601 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 20.115785485s Jan 10, 2025 21:36:41.601 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 20.115793526s Jan 10, 2025 21:36:41.601 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 20.115812795s Jan 10, 2025 21:36:41.601 Readloop processing time: 20.115821147s Jan 10, 2025 21:36:41.601 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 20.1155797s Jan 10, 2025 21:36:41.601 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 20.1155857s Jan 10, 2025 21:36:41.601 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 20.115599406s Jan 10, 2025 21:36:41.601 Readloop processing time: 20.115607813s Jan 10, 2025 21:36:41.601 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 20.145124997s Jan 10, 2025 21:36:41.601 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 20.145829714s Jan 10, 2025 21:36:41.601 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 20.09682322s Jan 10, 2025 21:36:41.601 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 20.09683373s Jan 10, 2025 21:36:41.601 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 20.09686148s Jan 10, 2025 21:36:41.601 Readloop processing time: 20.096870673s Jan 10, 2025 21:36:41.601 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 20.145856575s Jan 10, 2025 21:36:41.601 Readloop processing time: 20.145889836s Jan 10, 2025 21:36:41.601 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 20.106260935s Jan 10, 2025 21:36:41.601 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 20.106317274s Jan 10, 2025 21:36:41.601 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 19.7955792s Jan 10, 2025 21:36:41.601 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 19.79090089s Jan 10, 2025 21:36:41.601 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 19.790911313s Jan 10, 2025 21:36:41.601 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 19.790971984s Jan 10, 2025 21:36:41.601 Readloop processing time: 19.790981064s Jan 10, 2025 21:36:41.601 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 20.086430611s Jan 10, 2025 21:36:41.601 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 20.086438126s Jan 10, 2025 21:36:41.601 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 20.092201212s Jan 10, 2025 21:36:41.601 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 20.086459431s Jan 10, 2025 21:36:41.601 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 20.092223895s Jan 10, 2025 21:36:41.601 Readloop processing time: 20.086468871s Jan 10, 2025 21:36:41.601 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 19.788859706s Jan 10, 2025 21:36:41.601 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 19.788866164s Jan 10, 2025 21:36:41.601 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 19.788881793s Jan 10, 2025 21:36:41.601 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 19.79559647s Jan 10, 2025 21:36:41.601 Readloop processing time: 19.788889667s Jan 10, 2025 21:36:41.601 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 19.795673325s Jan 10, 2025 21:36:41.601 Readloop processing time: 19.79569008s Jan 10, 2025 21:36:41.601 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 20.217486528s Jan 10, 2025 21:36:41.601 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 20.253672708s Jan 10, 2025 21:36:41.601 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 20.092251712s Jan 10, 2025 21:36:41.601 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 20.106341939s Jan 10, 2025 21:36:41.601 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 20.12194619s Jan 10, 2025 21:36:41.601 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 20.129231928s Jan 10, 2025 21:36:41.601 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 20.118185907s Jan 10, 2025 21:36:41.602 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 20.064773269s Jan 10, 2025 21:36:41.602 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 20.064781478s Jan 10, 2025 21:36:41.602 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 20.064804816s Jan 10, 2025 21:36:41.602 Readloop processing time: 20.064820753s Jan 10, 2025 21:36:41.602 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 19.758202105s Jan 10, 2025 21:36:41.602 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 19.758209947s Jan 10, 2025 21:36:41.602 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 19.758254904s Jan 10, 2025 21:36:41.602 Readloop processing time: 19.758265147s Jan 10, 2025 21:36:41.602 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 19.899577333s Jan 10, 2025 21:36:41.602 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 19.900919769s Jan 10, 2025 21:36:41.602 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 19.88905719s Jan 10, 2025 21:36:41.602 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 19.889066925s Jan 10, 2025 21:36:41.602 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 19.889085143s Jan 10, 2025 21:36:41.602 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 19.556807998s Jan 10, 2025 21:36:41.602 Readloop processing time: 19.889094215s Jan 10, 2025 21:36:41.602 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 19.556819761s Jan 10, 2025 21:36:41.602 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 19.899583474s Jan 10, 2025 21:36:41.602 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 19.740206594s Jan 10, 2025 21:36:41.602 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 19.534190063s Jan 10, 2025 21:36:41.602 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 19.86360417s Jan 10, 2025 21:36:41.602 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 19.534200996s Jan 10, 2025 21:36:41.602 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 19.863612172s Jan 10, 2025 21:36:41.602 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 19.534233529s Jan 10, 2025 21:36:41.602 Readloop processing time: 19.53424406s Jan 10, 2025 21:36:41.602 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 19.85401466s Jan 10, 2025 21:36:41.602 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 20.008799414s Jan 10, 2025 21:36:41.602 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 20.008812137s Jan 10, 2025 21:36:41.602 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 20.008837844s Jan 10, 2025 21:36:41.602 Readloop processing time: 20.008850912s Jan 10, 2025 21:36:41.602 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 20.056681462s Jan 10, 2025 21:36:41.602 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 19.863632452s Jan 10, 2025 21:36:41.602 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 20.056765269s Jan 10, 2025 21:36:41.602 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 20.217987067s Jan 10, 2025 21:36:41.602 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 20.21801208s Jan 10, 2025 21:36:41.602 Readloop processing time: 20.218023566s Jan 10, 2025 21:36:41.602 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 19.907249957s Jan 10, 2025 21:36:41.602 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 19.907260126s Jan 10, 2025 21:36:41.602 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 19.980997915s Jan 10, 2025 21:36:41.602 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 19.981027768s Jan 10, 2025 21:36:41.602 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 19.981050762s Jan 10, 2025 21:36:41.602 Readloop processing time: 19.981077434s Jan 10, 2025 21:36:41.602 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 20.254357482s Jan 10, 2025 21:36:41.602 Readloop processing time: 20.254372786s Jan 10, 2025 21:36:41.602 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 19.854027586s Jan 10, 2025 21:36:41.602 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 19.854176232s Jan 10, 2025 21:36:41.602 Readloop processing time: 19.854186421s Jan 10, 2025 21:36:41.602 Readloop processing time: 20.0926528s Jan 10, 2025 21:36:41.602 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 20.025183621s Jan 10, 2025 21:36:41.602 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 20.025439699s Jan 10, 2025 21:36:41.602 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 20.025463625s Jan 10, 2025 21:36:41.602 Readloop processing time: 20.025470297s Jan 10, 2025 21:36:41.602 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 19.740218964s Jan 10, 2025 21:36:41.602 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 19.74052243s Jan 10, 2025 21:36:41.602 Readloop processing time: 19.740533291s Jan 10, 2025 21:36:41.602 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 19.89963993s Jan 10, 2025 21:36:41.602 Readloop processing time: 19.899950499s Jan 10, 2025 21:36:41.602 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 19.900936216s Jan 10, 2025 21:36:41.602 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 19.901293204s Jan 10, 2025 21:36:41.602 Readloop processing time: 19.901300295s Jan 10, 2025 21:36:41.602 Readloop processing time: 20.106862413s Jan 10, 2025 21:36:41.602 Readloop processing time: 20.122650168s Jan 10, 2025 21:36:41.602 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 19.747497519s Jan 10, 2025 21:36:41.602 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 19.747912983s Jan 10, 2025 21:36:41.602 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 19.747931165s Jan 10, 2025 21:36:41.602 Readloop processing time: 19.74793691s Jan 10, 2025 21:36:41.602 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 20.130191351s Jan 10, 2025 21:36:41.602 Readloop processing time: 20.130198243s Jan 10, 2025 21:36:41.602 Readloop processing time: 19.863685957s Jan 10, 2025 21:36:41.602 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 20.056789368s Jan 10, 2025 21:36:41.602 Readloop processing time: 20.057156219s Jan 10, 2025 21:36:41.602 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 19.907291596s Jan 10, 2025 21:36:41.602 Readloop processing time: 19.907643733s Jan 10, 2025 21:36:41.602 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 20.11900823s Jan 10, 2025 21:36:41.602 Readloop processing time: 20.11901473s Jan 10, 2025 21:36:41.602 Internal subscription on "$JS.API.STREAM.MSG.GET.KV_RoleElection" took too long: 19.556853121s Jan 10, 2025 21:36:41.602 Readloop processing time: 19.557375278s

Originally posted by @Inkathu in https://github.com/nats-io/nats-server/discussions/6397

Inkathu avatar Jun 13 '25 12:06 Inkathu

You're on quite an old server version, does this happen with 2.11.4?

neilalexander avatar Jun 13 '25 12:06 neilalexander

sending us the result of nats server request jsz N --all --config --timeout 60s > jsz.json (where N is the number of servers you have) for analysis might help us better understand the state of your system.

wallyqs avatar Jun 13 '25 13:06 wallyqs