haystack Introduce basic latency/performance events for node runs

Open tstadel opened this issue 2 years ago • 0 comments

As a devops engineer I want to know roughly how much time a request spends in each node.

Is your feature request related to a problem? Please describe.
Currently we don't know how much time a request spends in specific nodes. This information is essential for getting started with analyzing and optimizing slow pipelines.

Describe the solution you'd like
We introduce some logs that trace:

when a node starts processing a request (DEBUG)
when a node ends processing a request (DEBUG)
how much time a node spent processing a request in total (DEBUG)
Node name, node type, start time, end time and total time must be passed as extra fields on the log record, so that reports are easy to build in analytics and monitoring tools (STATISTICS)

These logs must be easily switched on and off.
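
A minimal sketch of how this could look with Python's standard `logging` module. The logger name `node_latency`, the helper `trace_node_run` and the extra field names are made up for illustration and are not existing Haystack APIs; how a pipeline would actually wrap its node calls is left open.

```python
import logging
import time
from contextlib import contextmanager
from datetime import datetime, timezone

# Hypothetical logger name, chosen for this sketch only.
logger = logging.getLogger("node_latency")


@contextmanager
def trace_node_run(node_name, node_type):
    """Log start, end and total duration of one node run at DEBUG level."""
    # Skip all timing work when DEBUG is switched off for this logger.
    if not logger.isEnabledFor(logging.DEBUG):
        yield
        return

    base = {"node_name": node_name, "node_type": node_type}
    start_time = datetime.now(timezone.utc).isoformat()
    started = time.perf_counter()
    logger.debug(
        "Node %s started", node_name,
        extra={**base, "start_time": start_time},
    )
    try:
        yield
    finally:
        # Runs even if the node raises, so failed runs are timed as well.
        total_time = time.perf_counter() - started
        end_time = datetime.now(timezone.utc).isoformat()
        logger.debug(
            "Node %s finished in %.3f s", node_name, total_time,
            extra={
                **base,
                "start_time": start_time,
                "end_time": end_time,
                "total_time": total_time,
            },
        )
```

A node call would then be wrapped as `with trace_node_run("Retriever", "BM25Retriever"): ...`. Switching the logs on or off comes down to setting the level of this one logger, for example `logging.getLogger("node_latency").setLevel(logging.DEBUG)`. A JSON log formatter can forward the extra fields to the monitoring stack.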

Describe alternatives you've considered
Pipelines could hold statistics about how often and for how long their nodes processed requests. This would give easy access to aggregate numbers. However, it would also take more effort, require additional get_statistics and reset_statistics methods, and would not be practical for monitoring live systems, which use tools like Grafana, Prometheus or Kibana to aggregate event-based numbers.
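
For comparison, a rough sketch of what this alternative might involve. Only the method names `get_statistics` and `reset_statistics` come from the description above; the `NodeStatistics` class, its `record` method and the shape of the returned dictionary are assumptions made for illustration.

```python
from collections import defaultdict


class NodeStatistics:
    """Hypothetical in-memory aggregate of node run counts and durations."""

    def __init__(self):
        self._counts = defaultdict(int)
        self._total_seconds = defaultdict(float)

    def record(self, node_name, seconds):
        # Called once per node run with the measured duration.
        self._counts[node_name] += 1
        self._total_seconds[node_name] += seconds

    def get_statistics(self):
        # Aggregate numbers per node: run count, total time and mean time.
        return {
            name: {
                "count": count,
                "total_seconds": self._total_seconds[name],
                "mean_seconds": self._total_seconds[name] / count,
            }
            for name, count in self._counts.items()
        }

    def reset_statistics(self):
        self._counts.clear()
        self._total_seconds.clear()
```

The state lives inside the process and has to be polled and reset by hand, which is the extra effort mentioned above and the reason it fits event-based monitoring tools poorly.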

Additional context
The main focus lies on bigger deployments that use analytics and monitoring tools. Notebook users should also be able to make use of it.

Aug 09 '22 08:08 tstadel
