
Instrument RustyVault with Prometheus

Open • cybershang opened this issue 5 months ago • 3 comments

Design

To monitor the system's performance effectively, I applied both the USE and RED methods for metrics collection in RustyVault.

  • USE Method (Utilization, Saturation, Errors):

    Track resource utilization and detect bottlenecks. Metrics related to system resources have been added to ensure the system's health is continuously monitored:

    • CPU Utilization: Measures the percentage of CPU usage by the RustyVault service.

    • Memory Utilization: Tracks memory usage, including total, free, and cached memory.

    • Disk I/O Saturation: Monitors disk read/write speed and detects potential bottlenecks.

    • Network I/O Saturation: Tracks the amount of data sent and received.

  • RED Method (Rate, Errors, Duration):

    Track the behavior of requests within the application:

    • Rate: We implemented requests_total to track the rate of requests coming into the system. This allows us to monitor the overall throughput.

    • Errors: The errors_total counter tracks the number of failed requests and helps monitor the system's error rate.

    • Duration: Using request_duration_seconds, we measure the time taken to process each request, enabling us to analyze latency and potential performance issues.
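To make the three RED quantities concrete, here is a minimal std-only sketch of how rate, error ratio, and mean duration can be derived from two successive readings of cumulative counters. The `RedSnapshot` type and its field names are hypothetical and are not part of RustyVault; in a real deployment Prometheus derives these values at query time from the scraped series.

```rust
/// One reading of the cumulative RED counters.
/// Hypothetical type, for illustration only.
#[derive(Clone, Copy, Debug)]
pub struct RedSnapshot {
    /// Value of a requests_total-style counter.
    pub requests_total: u64,
    /// Value of an errors_total-style counter.
    pub errors_total: u64,
    /// Sum of all observed request durations, in seconds.
    pub duration_sum_seconds: f64,
}

/// Number of requests that arrived between the two readings.
/// `saturating_sub` keeps the result at 0 if the counter was reset.
fn requests_between(prev: RedSnapshot, curr: RedSnapshot) -> u64 {
    curr.requests_total.saturating_sub(prev.requests_total)
}

/// Rate: requests per second over an interval of `interval_seconds`.
pub fn request_rate(prev: RedSnapshot, curr: RedSnapshot, interval_seconds: f64) -> f64 {
    requests_between(prev, curr) as f64 / interval_seconds
}

/// Errors: fraction of requests in the interval that failed.
/// Returns None when no requests arrived, since the ratio is undefined.
pub fn error_ratio(prev: RedSnapshot, curr: RedSnapshot) -> Option<f64> {
    let requests = requests_between(prev, curr);
    if requests == 0 {
        return None;
    }
    let errors = curr.errors_total.saturating_sub(prev.errors_total);
    Some(errors as f64 / requests as f64)
}

/// Duration: mean seconds per request over the interval.
/// Returns None when no requests arrived.
pub fn mean_duration_seconds(prev: RedSnapshot, curr: RedSnapshot) -> Option<f64> {
    let requests = requests_between(prev, curr);
    if requests == 0 {
        return None;
    }
    Some((curr.duration_sum_seconds - prev.duration_sum_seconds) / requests as f64)
}
```

A mean hides tail latency, which is one reason a histogram is the more useful metric type for the Duration part.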

Implemented Metrics

  • System Metrics
    • CPU
      • cpu_usage_percent: <Gauge, AtomicU64>
    • Memory
      • total_memory: <Gauge, AtomicU64>
      • used_memory: <Gauge, AtomicU64>
      • free_memory: <Gauge, AtomicU64>
    • Disk
      • total_disk_space: <Gauge, AtomicU64>
      • total_disk_available: <Gauge, AtomicU64>
    • Network
      • network_in_bytes: <Gauge, AtomicU64>
      • network_out_bytes: <Gauge, AtomicU64>
    • Load
      • load_average:
  • HTTP Request Metrics
    • struct HttpLabel {path:String, method:MetricsMethod, status:u16}
    • http_request_count: Family<HttpLabel, Counter>
    • http_request_duration_seconds: Family<HttpLabel, Histogram>
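The labeled HTTP metrics above keep one time series per distinct `HttpLabel` value. The sketch below models that behavior with the standard library only: `CounterFamily` is a hypothetical stand-in for a labeled counter family, not the prometheus-client type, and the `MetricsMethod` variant names are assumed. It mainly shows why the label struct needs `Clone`, `Hash`, and `Eq`.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

/// HTTP method label; the variant names are assumed for illustration.
#[derive(Clone, Copy, Debug, Hash, PartialEq, Eq)]
pub enum MetricsMethod {
    Get,
    Post,
    Put,
    Delete,
    Other,
}

/// Label set with the fields listed in the issue.
#[derive(Clone, Debug, Hash, PartialEq, Eq)]
pub struct HttpLabel {
    pub path: String,
    pub method: MetricsMethod,
    pub status: u16,
}

/// Std-only model of a labeled counter family:
/// one counter per distinct label set, created on first use.
#[derive(Default)]
pub struct CounterFamily {
    series: Mutex<HashMap<HttpLabel, u64>>,
}

impl CounterFamily {
    /// Increments the counter for `label` and returns its new value.
    pub fn inc(&self, label: &HttpLabel) -> u64 {
        let mut series = self.series.lock().unwrap();
        let value = series.entry(label.clone()).or_insert(0);
        *value += 1;
        *value
    }

    /// Current value for `label`, or 0 if that series does not exist yet.
    pub fn get(&self, label: &HttpLabel) -> u64 {
        self.series.lock().unwrap().get(label).copied().unwrap_or(0)
    }

    /// Number of distinct label sets seen so far.
    pub fn series_count(&self) -> usize {
        self.series.lock().unwrap().len()
    }
}
```

Because every distinct `path` value creates a new series, labeling by raw request path can grow the series count without bound; labeling by route pattern is a common way to keep it bounded.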

Changes

  1. Dependencies Imported:
    • prometheus-client = "0.22.3"
    • tokio = "1.40.0"
    • sysinfo = "0.31.4"
  2. MetricsManager Implementation:
    • Implemented MetricsManager in manager.rs to hold the Prometheus Registry, the system metrics (system_metrics), and the HTTP API metrics (http_metrics).
    • Integrated metrics_manager into the server in src/cli/command/server.rs by inserting it into app_data.
  3. metrics_handler Implementation:
    • Implemented init_metrics_service in metrics.rs. It sets up the /metrics service by configuring a route in the ServiceConfig and associates that route with metrics_handler, which handles GET requests and responds with Prometheus metrics in text format.
  4. System Metrics Collection:
    • Implemented SystemMetrics struct in system_metrics.rs to gather CPU, memory, load, and disk metrics using the sysinfo crate.
    • Added collect_metrics function to collect and store system information.
    • Launched the start_collecting method in server.block_on to periodically collect system metrics.
  5. HTTP Middleware:
    • Implemented MetricsMiddleware in middleware.rs as a function middleware to capture HTTP request metrics.
    • Configured the HTTP server in src/cli/command/server.rs to apply the middleware using .wrap(from_fn(metrics_middleware)).
    • Mapped Actix-web's HTTP methods onto a custom MetricsMethod enum that tracks GET, POST, PUT, and DELETE and groups all other methods as OTHER.
    • Recorded request duration by logging start and end timestamps for each request.
  6. HTTP Metrics:
    • Created HttpMetrics struct in http_metrics.rs to handle HTTP request counting and duration observation.
    • Registered two Prometheus metrics: a request counter and a histogram of request durations.
    • Added methods increment_request_count and observe_duration for tracking requests and their durations, labeled by HTTP method and path.
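The two core steps of the middleware described above, normalizing the HTTP method and timing the handler, can be sketched without any web framework. This is a std-only illustration under assumptions: the variant names, `from_http_method`, and `timed` are hypothetical and do not reflect the actual code in middleware.rs.

```rust
use std::time::{Duration, Instant};

/// Method label used for metrics; variant names are assumed.
#[derive(Clone, Copy, Debug, Hash, PartialEq, Eq)]
pub enum MetricsMethod {
    Get,
    Post,
    Put,
    Delete,
    Other,
}

impl MetricsMethod {
    /// Maps an HTTP method token to a label value.
    /// Method tokens are case-sensitive, so only the uppercase forms match;
    /// everything else falls into `Other` to keep the label set small.
    pub fn from_http_method(method: &str) -> Self {
        match method {
            "GET" => Self::Get,
            "POST" => Self::Post,
            "PUT" => Self::Put,
            "DELETE" => Self::Delete,
            _ => Self::Other,
        }
    }
}

/// Runs `handler` and returns its result together with the elapsed time.
/// `Instant` is monotonic, so the measurement is not affected by
/// wall-clock adjustments.
pub fn timed<T>(handler: impl FnOnce() -> T) -> (T, Duration) {
    let start = Instant::now();
    let result = handler();
    (result, start.elapsed())
}
```

In a real middleware the elapsed time would then be passed, as seconds, to something like observe_duration along with the method, path, and response status.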

Testing Steps

  1. Start RustyVault Service:
    • Ensure that Prometheus integration is enabled in the configuration.
  2. Access Metrics Endpoint:
    • Open a browser or use curl to visit http://localhost:<PORT>/metrics.
    • Verify that Prometheus metrics are correctly displayed.
  3. Trigger Various Requests:
    • Successful Requests:
      • Send valid requests to endpoints like /login and /register.
      • Confirm that requests_total and request_duration_seconds increment appropriately.
    • Failed Requests:
      • Send invalid or malformed requests to induce errors.
      • Check that errors_total increments accordingly.
  4. Integrate with Prometheus Server:
    • Add RustyVault's /metrics endpoint to the Prometheus configuration.
  5. Visualize with Grafana:
    • Use a Grafana dashboard to visualize the collected metrics and demonstrate the data.
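The manual check in steps 2 and 3 could also be scripted. The sketch below is a std-only helper that pulls the samples of one metric out of a Prometheus text exposition body, so a test can compare values before and after sending requests. `find_samples` is a hypothetical helper, and it assumes sample lines carry no trailing timestamp.

```rust
/// Returns `(label_set, value)` for every sample of `metric` found in a
/// Prometheus text exposition body. `label_set` is the raw `{...}` text,
/// or an empty string for an unlabeled sample.
/// Assumption: sample lines have no trailing timestamp.
pub fn find_samples(exposition: &str, metric: &str) -> Vec<(String, f64)> {
    let mut samples = Vec::new();
    for line in exposition.lines() {
        let line = line.trim();
        // Skip blank lines and comment lines such as # HELP, # TYPE, # EOF.
        if line.is_empty() || line.starts_with('#') {
            continue;
        }
        // The value is the last space-separated token on the line.
        if let Some((series, value)) = line.rsplit_once(' ') {
            // Split the series into metric name and optional label set.
            let (name, labels) = match series.find('{') {
                Some(start) => (&series[..start], &series[start..]),
                None => (series, ""),
            };
            if name == metric {
                if let Ok(value) = value.parse::<f64>() {
                    samples.push((labels.to_string(), value));
                }
            }
        }
    }
    samples
}
```

A test would fetch /metrics, call `find_samples` for the counter of interest, send a request, fetch again, and assert that the matching sample increased. Counters are often exposed with a `_total` suffix, so the exact name to search for should be taken from the real endpoint output rather than assumed.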


cybershang, Sep 17 '24 00:09