Support receiving logs in Loki using OpenTelemetry OTLP
Is your feature request related to a problem? Please describe.
I am running Grafana Loki inside a Kubernetes cluster but I have some applications running outside the cluster and I want to get logging data from those applications into Loki without relying on custom APIs or file-based logging.
Describe the solution you'd like
OpenTelemetry describes a number of approaches, including using the OpenTelemetry Collector. The OpenTelemetry Collector supports various types of exporters, and the OTLP exporter supports logs, metrics, and traces. Tempo supports receiving trace data via OTLP, and it would be great if Loki also had support for receiving log data via OTLP. This way, people could run the OpenTelemetry Collector next to their applications and send logs into Loki in a standard way, following the OpenTelemetry "New First-Party Application Logs" recommendations.
Currently, unless I am misunderstanding the Loki documentation, the only ingestion API into Loki is its custom push API.
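For context, here is a minimal sketch of what pushing through that custom API looks like, assuming an unauthenticated Loki listening on localhost:3100 (the JSON body follows the /loki/api/v1/push format):

package main

import (
    "bytes"
    "fmt"
    "log"
    "net/http"
    "time"
)

func main() {
    // One stream with one log line. The timestamp is unix epoch time in
    // nanoseconds, encoded as a string, as required by the push API.
    body := fmt.Sprintf(`{"streams": [{"stream": {"app": "example"}, "values": [["%d", "hello from outside the cluster"]]}]}`,
        time.Now().UnixNano())

    // Assumes an unauthenticated, single-tenant Loki; multi-tenant setups
    // would also need an X-Scope-OrgID header.
    resp, err := http.Post("http://localhost:3100/loki/api/v1/push",
        "application/json", bytes.NewBufferString(body))
    if err != nil {
        log.Fatal(err)
    }
    defer resp.Body.Close()
    log.Println("push status:", resp.Status)
}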
Details on the OTLP specification:
- OTLP/gRPC: https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/protocol/otlp.md#otlpgrpc
- OTLP/HTTP: https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/protocol/otlp.md#otlphttp
Describe alternatives you've considered
There are a number of Loki clients that one can use to get logs into Loki, but they all seem to involve using the custom Loki push API or reading from log files. Supporting the OpenTelemetry Collector would allow following the OpenTelemetry "New First-Party Application Logs" recommendations.
Done: https://github.com/grafana/loki/pull/5363
1. Grafana OTLP log view (screenshot not included)
2. Go client go.mod dependency:
go.opentelemetry.io/collector/model v0.44.0
3. Demo Go client code:
package main

import (
    "context"
    "testing"
    "time"

    "github.com/stretchr/testify/require"
    "go.opentelemetry.io/collector/model/otlpgrpc"
    "go.opentelemetry.io/collector/model/pdata"
    "google.golang.org/grpc"
    "google.golang.org/grpc/credentials/insecure"
)

// TestGrpcClient sends a small OTLP logs request to a local OTLP/gRPC endpoint.
func TestGrpcClient(t *testing.T) {
    grpcEndpoint := "localhost:4317"

    // Set up a plaintext gRPC connection to the OTLP endpoint.
    conn, err := grpc.Dial(grpcEndpoint, grpc.WithTransportCredentials(insecure.NewCredentials()))
    require.NoError(t, err)

    client := otlpgrpc.NewLogsClient(conn)
    request := makeRequest()
    _, err = client.Export(context.Background(), request)
    require.NoError(t, err)
}

// makeRequest builds an OTLP logs request containing two log records
// under a single resource and instrumentation library.
func makeRequest() otlpgrpc.LogsRequest {
    request := otlpgrpc.NewLogsRequest()
    pLog := pdata.NewLogs()

    // Resource-level attributes apply to all records below.
    rl := pLog.ResourceLogs().AppendEmpty()
    rl.Resource().Attributes().InsertString("app", "testApp")

    ilm := rl.InstrumentationLibraryLogs().AppendEmpty()
    ilm.InstrumentationLibrary().SetName("testName")

    now := time.Now()

    // First record: WARN severity.
    logRecord := ilm.LogRecords().AppendEmpty()
    logRecord.SetName("testName")
    logRecord.SetFlags(31)
    logRecord.SetSeverityNumber(13) // OTLP severity number for WARN
    logRecord.SetSeverityText("WARN")
    logRecord.SetSpanID(pdata.NewSpanID([8]byte{1, 2}))
    logRecord.SetTraceID(pdata.NewTraceID([16]byte{1, 2, 3, 4}))
    logRecord.Attributes().InsertString("level", "WARN")
    logRecord.SetTimestamp(pdata.NewTimestampFromTime(now))

    // Second record: INFO severity.
    logRecord2 := ilm.LogRecords().AppendEmpty()
    logRecord2.SetName("testName")
    logRecord2.SetFlags(31)
    logRecord2.SetSeverityNumber(9) // OTLP severity number for INFO
    logRecord2.SetSeverityText("INFO")
    logRecord2.SetSpanID(pdata.NewSpanID([8]byte{3, 4}))
    logRecord2.SetTraceID(pdata.NewTraceID([16]byte{1, 2, 3, 4}))
    logRecord2.Attributes().InsertString("level", "INFO")
    logRecord2.SetTimestamp(pdata.NewTimestampFromTime(now))

    request.SetLogs(pLog)
    return request
}
Hi! This issue has been automatically marked as stale because it has not had any activity in the past 30 days.
We use a stalebot among other tools to help manage the state of issues in this project. A stalebot can be very useful in closing issues in a number of cases; the most common is closing issues or PRs where the original reporter has not responded.
Stalebots are also emotionless and cruel and can close issues which are still very relevant.
If this issue is important to you, please add a comment to keep it open. More importantly, please add a thumbs-up to the original issue entry.
We regularly sort for closed issues which have a stale label, sorted by thumbs-up.
We may also:
- Mark issues as revivable if we think it's a valid issue but isn't something we are likely to prioritize in the future (the issue will still remain closed).
- Add a keepalive label to silence the stalebot if the issue is very common/popular/important.
We are doing our best to respond, organize, and prioritize all issues, but it can be a challenging task; our sincere apologies if you find yourself at the mercy of the stalebot.
May I ask, what's the current state here? :)
@frzifus The status of this is that we are still missing the API, but the key storage issue is addressed by non-indexed labels (see the upcoming docs PR: https://github.com/grafana/loki/pull/10073). As @slim-bean mentioned in his earlier comment, we need efficient storage for OTLP labels, and AFAIU, as mentioned in the last NASA community call, we are close to non-indexed labels.
Any update on the native OTLP support?
@sandeepsukhani might know a thing or two about this :-)
Hey folks, we have added experimental OTLP log ingestion support to Loki. It has yet to be released, so you would have to use the latest main to try it. You can read more about it in the docs. Please give it a try in your dev environments and share any feedback or suggestions.
Hi, really looking forward to that feature :)
I saw that service.instance.id will be considered a label; doesn't this have the potential to be a high-cardinality value?
Also, will it be possible to customize the "labels" list? In our case we run Nomad, so the k8s.* resource attributes wouldn't really work for us. But we would have resource attributes like nomad.job.name, which would make sense for us as labels.
@sandeepsukhani looks good, I'll give it a try next week.
One immediate suggestion is that I'd like to be able to configure the indexed labels so I can add/remove items from the list. Perhaps it should default to the list you have in the docs and then the user can provide their own list to override it.
Also, I see that span_id and trace_id are currently metadata; shouldn't trace_id at least be indexed so I can correlate logs to traces?
Another suggestion is that the conversion adds a 'severity_number' metadata attribute, which is not very useful; instead, it should map it to a 'level' field like the OpenTelemetry Collector translator does: https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/pkg/translator/loki/logs_to_loki.go.
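For illustration, the severity-number ranges in the OpenTelemetry log data model make that mapping straightforward; here is a rough sketch (not the translator's actual code, and the level strings are just one possible convention):

package main

import "fmt"

// severityToLevel maps an OTLP severity number (1-24) to a coarse level
// string, following the ranges defined in the OpenTelemetry log data model:
// 1-4 TRACE, 5-8 DEBUG, 9-12 INFO, 13-16 WARN, 17-20 ERROR, 21-24 FATAL.
func severityToLevel(severityNumber int32) string {
    switch {
    case severityNumber <= 0 || severityNumber > 24:
        return "unknown"
    case severityNumber <= 4:
        return "trace"
    case severityNumber <= 8:
        return "debug"
    case severityNumber <= 12:
        return "info"
    case severityNumber <= 16:
        return "warn"
    case severityNumber <= 20:
        return "error"
    default:
        return "fatal"
    }
}

func main() {
    fmt.Println(severityToLevel(13)) // prints "warn"
}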
Hi, can I ask if there are plans to support gRPC? Or maybe I missed some documentation and it's actually supported now?
Is this supported in Loki v3? I get a 404 error when calling the endpoint.
Does anyone know if this is now possible?
I have an OTEL collector running on a k8s cluster which I would like to gather logs from and send over to a remote loki stack running on another k8s cluster. I'm hoping to achieve this via OTLP HTTP, and there is documentation that seems to indicate that this is possible. However, after following the documentation I haven't had any success. Sending logs from the OTEL collector to a remote Loki instance should be possible through OTLP HTTP, right?
Yes, Loki v3 includes an OTLP port to ingest OTLP Logs natively.
A quick update on this - I was able to receive logs via OTLP HTTP successfully. Turned out to be a mistake with my config.
Solved
Well, never mind. I eventually found out I was just using the wrong version of Loki (grafana/loki instead of grafana/loki:3.0.0) and the OTLP endpoint wasn't ready yet. So if you have the issue described below, just upgrade 🤷
Original Problem
@alextricity25 Would you mind sharing how you got this to work? I am currently stuck at a stage where the collector gives me this error:
2024-06-23T17:30:51.116Z error exporterhelper/queue_sender.go:90 Exporting failed. Dropping data. {"kind": "exporter", "data_type": "logs", "name": "otlphttp", "error": "not retryable error: Permanent error: rpc error: code = Unimplemented desc = error exporting items, request to http://loki.telemetry.svc.cluster.local:3100/otlp/v1/logs responded with HTTP Status Code 404", "dropped_items": 10}
go.opentelemetry.io/collector/exporter/exporterhelper.newQueueSender.func1
    go.opentelemetry.io/collector/exporter@<version>/exporterhelper/queue_sender.go:90
go.opentelemetry.io/collector/exporter/internal/queue.(*boundedMemoryQueue[...]).Consume
    go.opentelemetry.io/collector/exporter@<version>/internal/queue/bounded_memory_queue.go:52
go.opentelemetry.io/collector/exporter/internal/queue.(*Consumers[...]).Start.func1
    go.opentelemetry.io/collector/exporter@<version>/internal/queue/consumers.go:43
For reference, this is the config for the collector I am currently deploying using the operator:
# OpenTelemetry Operator
apiVersion: opentelemetry.io/v1beta1
kind: OpenTelemetryCollector
metadata:
  name: otel-collector
  namespace: telemetry
spec:
  image: otel/opentelemetry-collector-contrib:0.103.0
  serviceAccount: otel-collector
  mode: daemonset
  volumeMounts:
    # Mount the volumes to the collector container
    - name: varlogpods
      mountPath: /var/log/pods
      readOnly: true
    - name: varlibdockercontainers
      mountPath: /var/lib/docker/containers
      readOnly: true
  volumes:
    # Typically the collector will want access to pod logs and container logs
    - name: varlogpods
      hostPath:
        path: /var/log/pods
    - name: varlibdockercontainers
      hostPath:
        path: /var/lib/docker/containers
  config:
    receivers:
      otlp:
        protocols:
          grpc: {}
          http: {}
      filelog:
        include_file_path: true
        include:
          - /var/log/pods/*/*/*.log
        exclude:
          - /var/log/pods/telemetry_otel-collector*/*/*.log
        operators:
          - id: container-parser
            type: container
    processors:
      batch: {}
    exporters:
      logging:
        loglevel: debug
      otlphttp:
        endpoint: http://loki.telemetry.svc.cluster.local:3100/otlp
        compression: none
        tls:
          insecure: true
      prometheus:
        endpoint: "0.0.0.0:8889"
    service:
      pipelines:
        metrics:
          receivers: [otlp]
          processors: [batch]
          exporters: [prometheus]
        logs:
          receivers: [otlp, filelog]
          processors: [batch]
          exporters: [logging, otlphttp]
And the Loki config:
auth_enabled: false
server:
  http_listen_port: 3100
  grpc_listen_port: 9095
common:
  path_prefix: /loki
  storage:
    filesystem:
      chunks_directory: /loki/chunks
      rules_directory: /loki/rules
  replication_factor: 1
  ring:
    kvstore:
      store: inmemory
query_range:
  results_cache:
    cache:
      embedded_cache:
        enabled: true
        max_size_mb: 100
schema_config:
  configs:
    - from: 2024-04-01
      object_store: s3
      store: tsdb
      schema: v13
      index:
        prefix: index_
        period: 24h
storage_config:
  tsdb_shipper:
    active_index_directory: /loki/tsdb-index
    cache_location: /loki/tsdb-cache
  aws:
    s3: s3://minioadmin:<secret-key>@<minio-host>:9000/loki-data
    s3forcepathstyle: true
limits_config:
  retention_period: 744h
  enforce_metric_name: false
  reject_old_samples: true
  reject_old_samples_max_age: 168h
  max_global_streams_per_user: 5000
  ingestion_rate_mb: 10
  ingestion_burst_size_mb: 20
  allow_structured_metadata: true
# chunk_store_config:
#   max_look_back_period: 744h
table_manager:
  retention_deletes_enabled: true
  retention_period: 744h
ruler:
  storage:
    type: local
    local:
      directory: /loki/rules
  rule_path: /loki/rules-temp
  # alertmanager_url: http://alertmanager:9093 (TODO: deploy alertmanager)
  ring:
    kvstore:
      store: inmemory
  enable_api: true
query_scheduler:
  max_outstanding_requests_per_tenant: 2048
frontend:
  max_outstanding_per_tenant: 2048
  compress_responses: true
This looks suspiciously like the collector is still using gRPC, but this is exactly what the docs tell me to do, so I am clueless here. Any help would be appreciated.
The documentation lists http://<loki-addr>:3100/otlp as the endpoint for the otlphttp exporter, but the actual endpoint is http://<loki-addr>:3100/otlp/v1/logs; this explains the 404.
Well, never mind. I eventually found out I was just using the wrong version of Loki (grafana/loki instead of grafana/loki:3.0.0) and the OTLP endpoint wasn't ready yet. So if you have the issue described below, just upgrade 🤷
The official docker-compose.yml still contains the old version:
https://github.com/grafana/loki/blob/0a7e9133590ffb361b9c4eb6c4b8a5b772d83676/production/docker-compose.yaml#L6-L8
Indeed, I found the /otlp reference in https://grafana.com/docs/loki/latest/send-data/. From https://grafana.com/docs/loki/latest/reference/loki-http-api/#ingest-logs-using-otlp:
When configuring the OpenTelemetry Collector, you must use endpoint: http://<loki-addr>:3100/otlp, as the collector automatically completes the endpoint. Entering the full endpoint will generate an error.
So there's some inconsistency somewhere in the docs or the examples.
Using Alloy, endpoint = "http://localhost:3100/otlp" works, but if you want to log directly, e.g., you need OTEL_EXPORTER_OTLP_LOGS_ENDPOINT=http://localhost:3100/otlp/v1/logs.
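To make the path difference concrete for direct clients, here is a rough sketch of a raw OTLP/HTTP push to the full /otlp/v1/logs path, using the JSON encoding defined by the OTLP spec. It assumes a local, unauthenticated Loki and that your Loki version accepts application/json; binary protobuf is the more widely supported encoding, so treat this as an illustration of the endpoint path rather than a reference client:

package main

import (
    "bytes"
    "fmt"
    "log"
    "net/http"
    "time"
)

func main() {
    // Minimal JSON-encoded OTLP logs payload: one resource, one scope,
    // one log record. service.name should become a label under Loki's
    // default OTLP attribute mapping.
    payload := fmt.Sprintf(`{
      "resourceLogs": [{
        "resource": {"attributes": [{"key": "service.name", "value": {"stringValue": "test-app"}}]},
        "scopeLogs": [{
          "logRecords": [{
            "timeUnixNano": "%d",
            "severityText": "INFO",
            "body": {"stringValue": "hello via a direct OTLP/HTTP push"}
          }]
        }]
      }]
    }`, time.Now().UnixNano())

    // Note the full path: the collector's otlphttp exporter appends
    // /v1/logs on its own, but a direct client has to include it.
    resp, err := http.Post("http://localhost:3100/otlp/v1/logs",
        "application/json", bytes.NewBufferString(payload))
    if err != nil {
        log.Fatal(err)
    }
    defer resp.Body.Close()
    log.Println("status:", resp.Status)
}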
I met the same problem. Have you solved it yet?
I solved this problem by upgrading the Loki docker container to version 3.1.1.
Closing this issue with the introduction of the native OTLP endpoint. Please reopen if required :)