Add metrics endpoint and basic request metrics otel based

Open amila-ku opened this issue 1 year ago • 12 comments

Resolves https://github.com/ollama/ollama/issues/3144

This pull request adds a /metrics endpoint and basic HTTP request metrics as a starting point. It uses the OpenTelemetry (otel) metrics library and exposes the metrics in Prometheus format.

To keep things simple, this PR does not try to cover all metrics. If this looks good, I can add a few more useful ones.
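For readers unfamiliar with the pattern, here is a minimal, dependency-free sketch of the idea: count requests per action and status code in a middleware, then render the counter in the Prometheus text format. It is not the code in this PR, which uses the OpenTelemetry metrics library; `requestCounter`, `statusRecorder`, and `instrument` are hypothetical names, and the label set is reduced to `action` and `status_code`.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sort"
	"strings"
	"sync"
)

// requestCounter is a hypothetical stdlib-only stand-in for an OTel counter.
type requestCounter struct {
	mu     sync.Mutex
	counts map[string]int64 // keyed by the rendered label set
}

func newRequestCounter() *requestCounter {
	return &requestCounter{counts: make(map[string]int64)}
}

// inc records one request for the given action and HTTP status code.
func (c *requestCounter) inc(action string, statusCode int) {
	key := fmt.Sprintf(`action=%q,status_code="%d"`, action, statusCode)
	c.mu.Lock()
	c.counts[key]++
	c.mu.Unlock()
}

// render returns the counter in Prometheus text exposition format,
// with label sets sorted so the output is deterministic.
func (c *requestCounter) render(name, help string) string {
	c.mu.Lock()
	defer c.mu.Unlock()
	keys := make([]string, 0, len(c.counts))
	for k := range c.counts {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	var b strings.Builder
	fmt.Fprintf(&b, "# HELP %s %s\n# TYPE %s counter\n", name, help, name)
	for _, k := range keys {
		fmt.Fprintf(&b, "%s{%s} %d\n", name, k, c.counts[k])
	}
	return b.String()
}

// statusRecorder captures the status code written by the wrapped handler.
type statusRecorder struct {
	http.ResponseWriter
	code int
}

func (r *statusRecorder) WriteHeader(code int) {
	r.code = code
	r.ResponseWriter.WriteHeader(code)
}

// instrument wraps next and counts each request once it has been served.
func instrument(c *requestCounter, action string, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		rec := &statusRecorder{ResponseWriter: w, code: http.StatusOK}
		next.ServeHTTP(rec, r)
		c.inc(action, rec.code)
	})
}

func main() {
	c := newRequestCounter()
	mux := http.NewServeMux()
	// Placeholder handler; the real /api/tags handler lives in the Ollama server.
	mux.Handle("/api/tags", instrument(c, "tags", http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		fmt.Fprintln(w, `{"models":[]}`)
	})))
	mux.HandleFunc("/metrics", func(w http.ResponseWriter, _ *http.Request) {
		w.Header().Set("Content-Type", "text/plain; version=0.0.4")
		fmt.Fprint(w, c.render("http_requests_total", "The total number of requests on all endpoints."))
	})

	// Exercise the handlers in-process instead of binding a port.
	for _, path := range []string{"/api/tags", "/api/tags", "/metrics"} {
		rec := httptest.NewRecorder()
		mux.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
		if path == "/metrics" {
			fmt.Print(rec.Body.String())
		}
	}
}
```

The otel exporter additionally attaches scope and target metadata (the `otel_scope_*` labels and `target_info`), which this sketch leaves out.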

How to test: once the Ollama server is running, pull a model and list models (or perform any other Ollama action), then query the endpoint:

curl http://127.0.0.1:11434/metrics

Example of the custom metrics (not all are shown, since I tried only a few commands):

Ollama request metrics:

% curl http://localhost:11434/metrics | grep -i ollama
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  5724    0  5724    0     0  1092k      0 --:--:-- --:--:-- --:--:-- 1117k
model_actions_total{action="list",otel_scope_name="ollama",otel_scope_version="",status="OK",status_code="200"} 1
otel_scope_info{otel_scope_name="ollama",otel_scope_version=""} 1
requests_total{action="all",otel_scope_name="ollama",otel_scope_version="",status="OK",status_code="200"} 2
target_info{service_name="unknown_service:ollama",telemetry_sdk_language="go",telemetry_sdk_name="opentelemetry",telemetry_sdk_version="1.27.0"} 1
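The same check can be scripted. The sketch below is an illustration rather than part of the PR: it scrapes the endpoint and keeps only the sample lines that mention "ollama", roughly mirroring `curl ... | grep -i ollama`. The helper name `filterSamples` is hypothetical, and the address assumes the default used in the commands above.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"os"
	"strings"
	"time"
)

// filterSamples returns the sample lines (comments and blank lines skipped)
// of a Prometheus text exposition that contain needle, ignoring case.
func filterSamples(body, needle string) []string {
	var out []string
	needle = strings.ToLower(needle)
	for _, line := range strings.Split(body, "\n") {
		line = strings.TrimSpace(line)
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		if strings.Contains(strings.ToLower(line), needle) {
			out = append(out, line)
		}
	}
	return out
}

func main() {
	// Assumption: the server listens on the default address used in this PR.
	client := &http.Client{Timeout: 5 * time.Second}
	resp, err := client.Get("http://127.0.0.1:11434/metrics")
	if err != nil {
		fmt.Fprintln(os.Stderr, "scrape failed:", err)
		return
	}
	defer resp.Body.Close()

	raw, err := io.ReadAll(resp.Body)
	if err != nil {
		fmt.Fprintln(os.Stderr, "read failed:", err)
		return
	}
	for _, line := range filterSamples(string(raw), "ollama") {
		fmt.Println(line)
	}
}
```

Unlike `grep`, this skips `# HELP` and `# TYPE` comment lines, so only metric samples are printed.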

All metrics:

% curl http://localhost:11434/metrics
# HELP go_gc_duration_seconds A summary of the pause duration of garbage collection cycles.
# TYPE go_gc_duration_seconds summary
go_gc_duration_seconds{quantile="0"} 2.2875e-05
go_gc_duration_seconds{quantile="0.25"} 2.2875e-05
go_gc_duration_seconds{quantile="0.5"} 3.0375e-05
go_gc_duration_seconds{quantile="0.75"} 3.0375e-05
go_gc_duration_seconds{quantile="1"} 3.0375e-05
go_gc_duration_seconds_sum 5.325e-05
go_gc_duration_seconds_count 2
# HELP go_goroutines Number of goroutines that currently exist.
# TYPE go_goroutines gauge
go_goroutines 10
# HELP go_info Information about the Go environment.
# TYPE go_info gauge
go_info{version="go1.22.0"} 1
# HELP go_memstats_alloc_bytes Number of bytes allocated and still in use.
# TYPE go_memstats_alloc_bytes gauge
go_memstats_alloc_bytes 3.247952e+06
# HELP go_memstats_alloc_bytes_total Total number of bytes allocated, even if freed.
# TYPE go_memstats_alloc_bytes_total counter
go_memstats_alloc_bytes_total 5.271112e+06
# HELP go_memstats_buck_hash_sys_bytes Number of bytes used by the profiling bucket hash table.
# TYPE go_memstats_buck_hash_sys_bytes gauge
go_memstats_buck_hash_sys_bytes 11524
# HELP go_memstats_frees_total Total number of frees.
# TYPE go_memstats_frees_total counter
go_memstats_frees_total 21294
# HELP go_memstats_gc_sys_bytes Number of bytes used for garbage collection system metadata.
# TYPE go_memstats_gc_sys_bytes gauge
go_memstats_gc_sys_bytes 3.159336e+06
# HELP go_memstats_heap_alloc_bytes Number of heap bytes allocated and still in use.
# TYPE go_memstats_heap_alloc_bytes gauge
go_memstats_heap_alloc_bytes 3.247952e+06
# HELP go_memstats_heap_idle_bytes Number of heap bytes waiting to be used.
# TYPE go_memstats_heap_idle_bytes gauge
go_memstats_heap_idle_bytes 1.998848e+06
# HELP go_memstats_heap_inuse_bytes Number of heap bytes that are in use.
# TYPE go_memstats_heap_inuse_bytes gauge
go_memstats_heap_inuse_bytes 5.7344e+06
# HELP go_memstats_heap_objects Number of allocated objects.
# TYPE go_memstats_heap_objects gauge
go_memstats_heap_objects 18185
# HELP go_memstats_heap_released_bytes Number of heap bytes released to OS.
# TYPE go_memstats_heap_released_bytes gauge
go_memstats_heap_released_bytes 1.88416e+06
# HELP go_memstats_heap_sys_bytes Number of heap bytes obtained from system.
# TYPE go_memstats_heap_sys_bytes gauge
go_memstats_heap_sys_bytes 7.733248e+06
# HELP go_memstats_last_gc_time_seconds Number of seconds since 1970 of last garbage collection.
# TYPE go_memstats_last_gc_time_seconds gauge
go_memstats_last_gc_time_seconds 1.724793883369717e+09
# HELP go_memstats_lookups_total Total number of pointer lookups.
# TYPE go_memstats_lookups_total counter
go_memstats_lookups_total 0
# HELP go_memstats_mallocs_total Total number of mallocs.
# TYPE go_memstats_mallocs_total counter
go_memstats_mallocs_total 39479
# HELP go_memstats_mcache_inuse_bytes Number of bytes in use by mcache structures.
# TYPE go_memstats_mcache_inuse_bytes gauge
go_memstats_mcache_inuse_bytes 9600
# HELP go_memstats_mcache_sys_bytes Number of bytes used for mcache structures obtained from system.
# TYPE go_memstats_mcache_sys_bytes gauge
go_memstats_mcache_sys_bytes 15600
# HELP go_memstats_mspan_inuse_bytes Number of bytes in use by mspan structures.
# TYPE go_memstats_mspan_inuse_bytes gauge
go_memstats_mspan_inuse_bytes 137120
# HELP go_memstats_mspan_sys_bytes Number of bytes used for mspan structures obtained from system.
# TYPE go_memstats_mspan_sys_bytes gauge
go_memstats_mspan_sys_bytes 146880
# HELP go_memstats_next_gc_bytes Number of heap bytes when next garbage collection will take place.
# TYPE go_memstats_next_gc_bytes gauge
go_memstats_next_gc_bytes 5.69504e+06
# HELP go_memstats_other_sys_bytes Number of bytes used for other system allocations.
# TYPE go_memstats_other_sys_bytes gauge
go_memstats_other_sys_bytes 1.404724e+06
# HELP go_memstats_stack_inuse_bytes Number of bytes in use by the stack allocator.
# TYPE go_memstats_stack_inuse_bytes gauge
go_memstats_stack_inuse_bytes 655360
# HELP go_memstats_stack_sys_bytes Number of bytes obtained from system for stack allocator.
# TYPE go_memstats_stack_sys_bytes gauge
go_memstats_stack_sys_bytes 655360
# HELP go_memstats_sys_bytes Number of bytes obtained from system.
# TYPE go_memstats_sys_bytes gauge
go_memstats_sys_bytes 1.3126672e+07
# HELP go_threads Number of OS threads created.
# TYPE go_threads gauge
go_threads 11
# HELP list_requests_total The total number of model list requests that have been attempted.
# TYPE list_requests_total counter
list_requests_total{action="",otel_scope_name="ollama",otel_scope_version="",status="",status_code="0"} 4
# HELP otel_scope_info Instrumentation Scope metadata
# TYPE otel_scope_info gauge
otel_scope_info{otel_scope_name="ollama",otel_scope_version=""} 1
# HELP promhttp_metric_handler_requests_in_flight Current number of scrapes being served.
# TYPE promhttp_metric_handler_requests_in_flight gauge
promhttp_metric_handler_requests_in_flight 1
# HELP promhttp_metric_handler_requests_total Total number of scrapes by HTTP status code.
# TYPE promhttp_metric_handler_requests_total counter
promhttp_metric_handler_requests_total{code="200"} 1
promhttp_metric_handler_requests_total{code="500"} 0
promhttp_metric_handler_requests_total{code="503"} 0
# HELP requests_total The total number of requests on all endpoints.
# TYPE requests_total counter
requests_total{action="all",otel_scope_name="ollama",otel_scope_version="",status="OK",status_code="200"} 3
# HELP target_info Target metadata
# TYPE target_info gauge
target_info{service_name="unknown_service:ollama",telemetry_sdk_language="go",telemetry_sdk_name="opentelemetry",telemetry_sdk_version="1.27.0"} 1

amila-ku avatar Aug 27 '24 23:08 amila-ku

@amila-ku great work here! I'm really looking forward to seeing this integrated. Is there anything I can support with furthering the PR?

msbsh avatar Sep 06 '24 06:09 msbsh

> @amila-ku great work here! I'm really looking forward to seeing this integrated. Is there anything I can support with furthering the PR?

Thanks, I will continue by adding a few tests and cleaning up the implementation a bit. I think it is better to keep this PR as a small starting point, so I will try to keep it simple.

amila-ku avatar Sep 16 '24 20:09 amila-ku

Looking good to me for the existing needs of Ollama server.

yuliyantsvetkov avatar Oct 01 '24 07:10 yuliyantsvetkov

Added otel scope versions and runtime information

% curl http://localhost:11434/metrics | grep -i ollama
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 12454    0 12454    0     0  11.5M      0 --:--:-- --:--:-- --:--:-- 11.8M
http_requests_total{action="all",otel_scope_name="ollama",otel_scope_version="0.55.0",status="OK",status_code="200"} 4
http_requests_total{action="head",otel_scope_name="ollama",otel_scope_version="0.55.0",status="OK",status_code="200"} 2
http_requests_total{action="pull",otel_scope_name="ollama",otel_scope_version="0.55.0",status="OK",status_code="200"} 1
http_requests_total{action="tags",otel_scope_name="ollama",otel_scope_version="0.55.0",status="OK",status_code="200"} 1
otel_scope_info{otel_scope_name="ollama",otel_scope_version="0.55.0"} 1
target_info{process_runtime_description="go version go1.22.5 darwin/arm64",service_name="ollama",service_version="v0.1.0"} 1

amila-ku avatar Oct 13 '24 19:10 amila-ku

@jessegross @jmorganca Please check this out.

amila-ku avatar Oct 14 '24 22:10 amila-ku

Hey guys, I really appreciate your effort and hard work on this! Any news on this PR? I am really looking forward to seeing this in the official branch. Thanks a lot!

syndimann avatar Nov 16 '24 21:11 syndimann

+1 is this PR going in?

shivanipatel7 avatar Dec 04 '24 16:12 shivanipatel7

When can this PR be merged? I am looking forward to it.

zzjcool avatar Jan 14 '25 02:01 zzjcool

Pretty please?

einyx avatar Feb 03 '25 19:02 einyx

Hi! 👍🏼 for the PR, I'm waiting for it too!

jacky0wl avatar Feb 13 '25 14:02 jacky0wl

+1 would be great to get this merged

arnesund avatar Feb 17 '25 11:02 arnesund

This would be amazing.

kyletaylored avatar Feb 20 '25 06:02 kyletaylored

Any updates?

mcpeixoto avatar Mar 18 '25 11:03 mcpeixoto

Hello, could the requests_total metric include the model name? That would be very useful when multiple models are served from a single instance. It would also be nice if request duration were exported.
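To make the request concrete, here is a rough stdlib-only sketch of what a per-model duration metric could look like in the Prometheus text format. Everything in it is an assumption for illustration: the type `modelStats`, the metric name `request_duration_seconds`, and the model name `llama3` do not come from this PR.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
	"sync"
	"time"
)

// modelStats is hypothetical: it tracks the request count and the total
// request duration (in seconds) per model name.
type modelStats struct {
	mu    sync.Mutex
	count map[string]int64
	sum   map[string]float64
}

func newModelStats() *modelStats {
	return &modelStats{count: make(map[string]int64), sum: make(map[string]float64)}
}

// observe records one finished request for model that took the given seconds.
func (s *modelStats) observe(model string, seconds float64) {
	s.mu.Lock()
	s.count[model]++
	s.sum[model] += seconds
	s.mu.Unlock()
}

// render returns the stats as a Prometheus summary without quantiles,
// i.e. only the _sum and _count series, with models in sorted order.
func (s *modelStats) render() string {
	s.mu.Lock()
	defer s.mu.Unlock()
	models := make([]string, 0, len(s.count))
	for m := range s.count {
		models = append(models, m)
	}
	sort.Strings(models)
	var b strings.Builder
	b.WriteString("# TYPE request_duration_seconds summary\n")
	for _, m := range models {
		fmt.Fprintf(&b, "request_duration_seconds_sum{model=%q} %g\n", m, s.sum[m])
		fmt.Fprintf(&b, "request_duration_seconds_count{model=%q} %d\n", m, s.count[m])
	}
	return b.String()
}

func main() {
	stats := newModelStats()

	// Time a placeholder unit of work; a real server would time the handler.
	start := time.Now()
	time.Sleep(10 * time.Millisecond)
	stats.observe("llama3", time.Since(start).Seconds())

	fmt.Print(stats.render())
}
```

Dividing the `_sum` series by the `_count` series gives the average request duration per model. A model label adds one series per model, so cardinality grows with the number of models served.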

xhejtman avatar Mar 19 '25 15:03 xhejtman

I like this too!

S33G avatar Apr 21 '25 20:04 S33G

How does this compare to the existing external frcooper/ollama-exporter? I'd prefer something built in, but that one already exists.

lapo-luchini avatar May 29 '25 10:05 lapo-luchini

I merged the PR to current main in order to test it locally.

lapo-luchini avatar Jun 21 '25 13:06 lapo-luchini

@amila-ku I think we have this in docker model runner if you want to contribute there:

https://github.com/docker/model-runner

Please star, fork and contribute!

ericcurtin avatar Oct 13 '25 23:10 ericcurtin

Is there any update on this?

ahmad-asadi avatar Nov 22 '25 11:11 ahmad-asadi