junos_exporter

remove connection and query deadlocks

Open lethalwp opened this issue 2 years ago • 2 comments

By removing these locks, I fixed one of my performance issues. When I use Prometheus with junos_exporter at a poll rate of 1 minute with 20 devices, it works perfectly.

But when I tried it with 250 devices, non-replies ramped up quickly and devices looked unresponsive: http://junosurl/metrics took minutes to reply, even though the duration metric showed only a few seconds once the answer came.

After each Prometheus/junos_exporter restart, I had to do a "ramp-up" as a workaround: set the polling interval and timeout to 5 minutes, wait until every device was connected, and then change the configuration to lower the polling interval to 1 minute.

Removing those locks fixed this issue for me, especially the newconnection one. I'm not sure why they were needed.

lethalwp avatar May 18 '22 07:05 lethalwp

  • Also, when the timeouts happened, I quickly hit a "no free socket" error (1024 open sockets).

With this lock removal, the number of open sockets stays at ~250.

lethalwp avatar May 18 '22 07:05 lethalwp

I've tested running a long-running query on a device without the lock and, while it was running, ran a second query. Both came out correctly. I don't think the locks are needed. What do you think?

lethalwp avatar May 23 '22 09:05 lethalwp