client_python
client_python copied to clipboard
CollectorRegistry undefined order of unlocking
The CollectorRegistry
uses theading.Lock
for mutexing purposes. It does not have a defined order of unlocks.
In our use-case sometimes three or more actions happen almost simultaneously and sometime this would case them to appear in the wrong order.
The easiest solution in my opinion would be to use threading.Semaphore
instead.
I would even be willing to create a PR, however before that I would like to know whether this would be desired in the first place. As a temporary work around I used a threading.Semaphore
in our exporter on top of CollectorRegistry
this however is largely unnecessary as the locking is implemented twice: once in our exporter and once in the client library.
Can you please explain more about your issue?
Sure, our use-case might a bit of a stretch for a pull-based problem. I wrote a simple exporter using this library to expose the metrics of our CI (buildbot). While one already exists https://github.com/claws/buildbot-prometheus it is not very useful for us.
What we need is the state of the build: running, pending, success, failed, etc. These events come asynchronously via buildbot's master Message Queue. What would quite often happen is that there would three state changes very close of one another say pending -> running -> success within a couple ms.
The pending would acquire the lock for the CollectorRegistry
as would the other two. Then since the order of unlocking is not defined for Locks
the events would end up in the opposite order there fore it would appear as pending -> success -> running and the build would appear to hang on running even when it was not. This would not change until it was run again.
This could be solved by using a Semaphore
instead which has defined order of unlocking in the order of acquiring them.
I understand this situation is more suitable for an event based push system, but for now it is working for us as is with the addition of a Semaphore
on top of CollectorRegistry
What is the performance cost of this change?
I was unable to find a resource on the exact performance difference of Lock and Semaphore in Python.
However the difference should not be that noticeable since a lock is a special case (unary) semaphore. The only difference is that lock only holds boolean value whether it is open or not whereas a semaphore has a counter that increments and decrements. The counter then decides which thread should go into the critical section next.