[Serve][1/n] Add autoscaling prometheus metrics
https://anyscale-ray--59220.com.readthedocs.build/en/59220/serve/monitoring.html#built-in-ray-serve-metrics
fixes https://github.com/ray-project/ray/issues/59218
Docs changes:
- [x] Refactored the table of all metrics; IMO Markdown is easier to read in code.
- [x] Split the metrics table into categories, ordered by the typical request path.
- [x] Included a stick diagram of the important metrics, showing where in the request lifecycle each metric is recorded.
- [x] Ordered the metrics within each table by their position in the request path.
This PR adds the following new metrics:
- ray_serve_deployment_target_replicas: Target number of replicas
  Tags: deployment, application
- ray_serve_autoscaling_decision_replicas: Raw decision before bounds
  Tags: deployment, application
- ray_serve_autoscaling_total_requests: Total requests seen by autoscaler
  Tags: deployment, application
- ray_serve_autoscaling_policy_execution_time_ms: Policy execution time
  Tags: deployment, application, policy_scope
- ray_serve_autoscaling_replica_metrics_delay_ms: Replica metrics delay
  Tags: deployment, application, replica
- ray_serve_autoscaling_handle_metrics_delay_ms: Handle metrics delay
  Tags: deployment, application, handle
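
As a rough illustration of how two of these metrics might be read together, here is a minimal sketch that parses Prometheus text-format samples and lists deployments where the raw autoscaling decision differs from the target replica count. The sample text, its values, the `Translator` and `app1` names, and the helpers `parse_samples` and `differing_deployments` are all hypothetical and not part of this PR. Reading a difference as "the bounds were applied" is an assumption based on the "Raw decision before bounds" description above.

```python
import re
from collections import defaultdict

# Hypothetical scrape output. The values, deployment name, and application
# name are invented for illustration only.
SAMPLE = """\
# TYPE ray_serve_deployment_target_replicas gauge
ray_serve_deployment_target_replicas{deployment="Translator",application="app1"} 5
ray_serve_autoscaling_decision_replicas{deployment="Translator",application="app1"} 8
ray_serve_autoscaling_total_requests{deployment="Translator",application="app1"} 42
"""

# Matches lines of the form: metric_name{label="value",...} 1.0
LINE_RE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)$'
)
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_samples(text):
    """Return {(deployment, application): {metric_name: value}}."""
    out = defaultdict(dict)
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        match = LINE_RE.match(line)
        if match is None:
            continue
        labels = dict(LABEL_RE.findall(match.group("labels")))
        key = (labels.get("deployment"), labels.get("application"))
        out[key][match.group("name")] = float(match.group("value"))
    return dict(out)


def differing_deployments(samples):
    """List (key, decision, target) where the raw decision != the target."""
    result = []
    for key, metrics in samples.items():
        decision = metrics.get("ray_serve_autoscaling_decision_replicas")
        target = metrics.get("ray_serve_deployment_target_replicas")
        if decision is not None and target is not None and decision != target:
            result.append((key, decision, target))
    return result


samples = parse_samples(SAMPLE)
differing = differing_deployments(samples)
for (deployment, application), decision, target in differing:
    print(f"{application}/{deployment}: decision={decision:g}, target={target:g}")
```

The sketch groups samples only by the `deployment` and `application` tags, which all of the new metrics share. Any additional labels that a real scrape attaches are ignored.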