elasticsearch
[ML] Inference duration and error metrics
Add es.inference.requests.time metric around infer API.
As recommended by the OTel spec, errors are indicated by the presence of the error.type attribute on the metric; on success the attribute is absent. "error.type" is set to the HTTP status code (as a string) when one is available, otherwise to the name of the exception (e.g. NullPointerException).
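A minimal sketch of that rule (not the actual Elasticsearch code; the class and method names here are hypothetical):

```java
import java.util.HashMap;
import java.util.Map;

public class ErrorTypeAttribute {
    // Returns the attributes to attach to the metric sample. A successful
    // request carries no "error.type" key; a failed request carries either
    // the HTTP status code (as a string) or the exception's class name.
    static Map<String, Object> attributes(Integer httpStatus, Throwable failure) {
        Map<String, Object> attrs = new HashMap<>();
        if (failure == null) {
            return attrs; // success: absence of error.type marks the request as OK
        }
        if (httpStatus != null) {
            attrs.put("error.type", String.valueOf(httpStatus));
        } else {
            attrs.put("error.type", failure.getClass().getSimpleName());
        }
        return attrs;
    }
}
```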
Additional notes:
- ApmInferenceStats has been merged into InferenceStats. We originally planned to have multiple implementations, but only the APM one is used now.
- Request count is now always recorded, even when there are failures loading the endpoint configuration.
- Added a hook in streaming for cancel messages, so we can close the metrics when a user cancels the stream.
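The notes above can be sketched roughly as follows. This is an illustrative stand-in, not the plugin's real API: the timer records a duration sample (with its attributes) whether the call succeeds or throws, which is how the count stays accurate even when loading the endpoint configuration fails.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

public class InferenceTimer {
    record Sample(double millis, Map<String, Object> attributes) {}

    // Stand-in for the APM meter's histogram; the real code reports to APM.
    static final List<Sample> RECORDED = new ArrayList<>();

    // Times the infer call and records the duration in the finally block,
    // so a sample is emitted on success, failure, or stream cancellation.
    static <T> T timed(Supplier<T> infer, Map<String, Object> attrs) {
        long start = System.nanoTime();
        try {
            return infer.get();
        } finally {
            double millis = (System.nanoTime() - start) / 1_000_000.0;
            RECORDED.add(new Sample(millis, attrs));
        }
    }
}
```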
Example from local node to APM (redacted a bunch):
```json
{
  "_index": ".ds-metrics-apm.app.elasticsearch-default-2024.10.25-000001",
  "data_stream": {
    "dataset": "apm.app.elasticsearch",
    "namespace": "default",
    "type": "metrics"
  },
  "es": {
    "inference": {
      "requests": {
        "time": {
          "values": [
            6992.5
          ],
          "counts": [
            1
          ]
        }
      }
    }
  },
  "labels": {
    "model_id": "gpt-4o-mini",
    "otel_instrumentation_scope_name": "elasticsearch",
    "service": "openai",
    "task_type": "completion"
  },
  "numeric_labels": {
    "status_code": 200
  },
  ...
}
}
```
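The `time` field in the document above is a histogram-style metric: parallel `values` and `counts` arrays, where each count says how many requests fell at that duration. A small sketch of deriving the mean duration from those arrays (a reader-side calculation, not part of the PR):

```java
public class HistogramMean {
    // Mean = sum(value_i * count_i) / sum(count_i); 0.0 if there are no samples.
    static double mean(double[] values, long[] counts) {
        double total = 0.0;
        long n = 0;
        for (int i = 0; i < values.length; i++) {
            total += values[i] * counts[i];
            n += counts[i];
        }
        return n == 0 ? 0.0 : total / n;
    }
}
```

For the single sample above (`values: [6992.5]`, `counts: [1]`) the mean is simply 6992.5 ms.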
Hi @prwhelan, I've created a changelog YAML for you.
@elasticmachine update branch
Pinging @elastic/ml-core (Team:ML)
💚 All backports created successfully
| Status | Branch | Result |
|---|---|---|
| ✅ | 8.x | |
Questions? Please refer to the Backport tool documentation.