Threaded telemetry
This PR improves the telemetry mechanism such that all telemetry-related actions now run in an isolated thread.
Main problem addressed
In the previous version, telemetry actions ran in the main thread, blocking the execution of the actual BayBE code. This was especially noticeable in situations where the endpoint is unreachable (generally for external users or when code was executed without internet connection, e.g. me sitting on the train), resulting in longer turnaround times and often giving the impression that "nothing happens".
Solution
In the new version, the main thread interacts with telemetry only via an event queue, giving an entirely frictionless experience.
Other improvements
- The monolithic telemetry script has been refactored into easy-to-digest pieces. Furthermore, the separation into threads results an overall cleaner architecture, which also offers more flexibility. For example, things could now be easily extended to disable telemetry only temporarily, if needed.
- Telemetry packages are now loaded lazily, avoiding imports altogether if connection is not possible.
- Telemetry functionality has been split into "public" and "private" parts, providing a clean API that can be used inside BayBE code (while keeping the aforementioned benefit of lazy imports).
@AVHopp @Scienfitz: Here a little quality-of-life PR solving a problem that kept annoying me since quite a while. Note: I still need to run a final test once the telemetry sync is back up. But we can already prepare the PR for merge in the meantime.
before I review this: can you elaborate how telemetry so far could be blocking? As far as I understand all requests handled by the telemetry dependencies are already done asynchonously and even in the scenario described (going from a zone of previously established connectivity to a no-connectivity zone) had onlye ver resulted in prints, never blocking of code execution.
@AdrianSosic just as reminder, we need to perform a minimal check whether the changed implementation still submits telemetry as expected. We can do this together in a short meeting where I check out the factory while you trigger telemetry simulation If that works then Im happy for review and merge of this
@AdrianSosic
are you certain everything was pushed? It seems you haven't even touched the script that was intended to test telemetry tests/simulate_telemetry.py (it was broken due to changes you did in the module)
Here is what I get still after the changes:
(baybe-dev) ➜ baybe git:(feature/threaded_telemetry) ✗ python tests/simulate_telemetry.py
Actual User Details: {'host': 'REDACTED', 'user': 'REDACTED', 'version': '0.12.2.post11'}
Exception ignored in: <module 'collections.abc' from 'REDACTED/lib/python3.10/collections/abc.py'>
Traceback (most recent call last):
File "<string>", line 4, in <module>
File "REDACTED/lib/python3.10/abc.py", line 115, in register
return _abc_register(cls, subclass)
File "REDACTED/lib/python3.10/abc.py", line 123, in __subclasscheck__
return _abc_subclasscheck(cls, subclass)
RecursionError: maximum recursion depth exceeded in comparison
<frozen importlib._bootstrap>:241: RuntimeWarning: Cython module failed to patch module with custom type
Fake User1
Fake User1a
Fake User2
Fake User3
(after a minute or so:)
Cannot call collect on a MetricReader until it is registered on a MeterProvider
Cannot call collect on a MetricReader until it is registered on a MeterProvider
@AdrianSosic what is the status here? I guess that should at least be put as a Draft, right?
@AdrianSosic what is the status here? I guess that should at least be put as a Draft, right?
Done. Will be abandoned in case we drop telemetry altogether
abandoned in favor of #615