
Exporting failed. Dropping data for host node logs

Open ba1ajinaidu opened this issue 1 year ago • 7 comments

Describe the bug
I'm trying to export host-level logs from a Kubernetes node to Quickwit with the otel-collector, using the filelog receiver to read the /var/log/messages file and the OTLP exporter to ship it. But the export fails with the error logs below:

2024-03-15T17:10:37.980Z	error	exporterhelper/queue_sender.go:97	Exporting failed. Dropping data.	{"kind": "exporter", "data_type": "logs", "name": "otlp", "error": "not retryable error: Permanent error: rpc error: code = Internal desc = ", "dropped_items": 2}
go.opentelemetry.io/collector/exporter/exporterhelper.newQueueSender.func1
	go.opentelemetry.io/collector/exporter@v0.96.0/exporterhelper/queue_sender.go:97
go.opentelemetry.io/collector/exporter/internal/queue.(*boundedMemoryQueue[...]).Consume
	go.opentelemetry.io/collector/exporter@v0.96.0/internal/queue/bounded_memory_queue.go:57
go.opentelemetry.io/collector/exporter/internal/queue.(*Consumers[...]).Start.func1
	go.opentelemetry.io/collector/exporter@v0.96.0/internal/queue/consumers.go:43

Steps to reproduce
Install the otel-collector with the helm values given below, and install Quickwit with its default helm values.

What did you expect to see? Logs should be exported to Quickwit.

What did you see instead?

Logs are not being exported and are dropped.

What version did you use? 0.96.0

What config did you use? Used the helm chart to install the collector.

helm-values.yml

mode: daemonset
presets:
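  # logsCollection wires up the chart's own filelog receiver for container logs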
  logsCollection:
    enabled: true
  kubernetesEvents:
    enabled: true

extraVolumes:
  - name: varlog
    hostPath:
      path: /var/log
extraVolumeMounts:
  - name: varlog
    readOnly: true
    mountPath: /var/log
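# the collector image runs as a non-root user (UID 10001), hence the chown below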
initContainers:
  - name: init-fs
    image: busybox:latest
    command:
      - sh
      - "-c"
      - "chown -R 10001: /var/log"
    volumeMounts:
      - name: varlog
        mountPath: /var/log
config:
  receivers:
    filelog/host:
      include:
        - /var/log/messages
  exporters:
    otlp:
      endpoint: quickwit-indexer.quickwit.svc.cluster.local:7281
      tls:
        insecure: true
  service:
    pipelines:
      logs/system:
        receivers: [filelog/host]
        processors: [batch]
        exporters: [otlp]

Environment

Additional context

ba1ajinaidu avatar Mar 15 '24 17:03 ba1ajinaidu

@ba1ajinaidu it looks like only your exporter is having trouble. Check that your endpoint/port is correct and available.

TylerHelmuth avatar Mar 18 '24 18:03 TylerHelmuth

@TylerHelmuth I checked the endpoint/port; both are correct and work for other logs, but it still fails for this file.

ba1ajinaidu avatar Mar 18 '24 18:03 ba1ajinaidu

Oh interesting. Can you add a debug exporter with verbosity: detailed to the pipeline and isolate it to only the troubled file?
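For example, something like this merged into the helm values above (a rough sketch; the debug exporter and its verbosity: detailed setting are standard collector components, but exact placement depends on your chart values):

config:
  exporters:
    debug:
      # prints every exported log record in full
      verbosity: detailed
  service:
    pipelines:
      logs/system:
        receivers: [filelog/host]
        processors: [batch]
        # keep otlp to reproduce the failure; debug shows what is being sent
        exporters: [otlp, debug]

Since filelog/host only reads /var/log/messages, this pipeline is already isolated to the troubled file.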

TylerHelmuth avatar Mar 18 '24 19:03 TylerHelmuth

Flags: 0
LogRecord #1
ObservedTimestamp: 2024-03-19 03:18:16.297599282 +0000 UTC
Timestamp: 1970-01-01 00:00:00 +0000 UTC
SeverityText:
SeverityNumber: Unspecified(0)
Body: Str(Mar 19 03:18:16 ip-10-70-255-117 kubelet: E0319 03:18:16.276293    1580 pod_workers.go:1298] "Error syncing pod, skipping" err="failed to \"StartContainer\" for \"aws-eks-nodeagent\" with CrashLoopBackOff: \"back-off 5m0s restarting failed container=aws-eks-nodeagent pod=aws-node-qn2wb_kube-system(faba3791-df15-4e0b-8a1c-1089a6f6db10)\"" pod="kube-system/aws-node-qn2wb" podUID="faba3791-df15-4e0b-8a1c-1089a6f6db10")
Attributes:
     -> log.file.name: Str(messages)
Trace ID:
Span ID:
Flags: 0
	{"kind": "exporter", "data_type": "logs", "name": "debug"}
2024-03-19T03:18:16.399Z	error	exporterhelper/queue_sender.go:97	Exporting failed. Dropping data.	{"kind": "exporter", "data_type": "logs", "name": "otlp", "error": "not retryable error: Permanent error: rpc error: code = Internal desc = ", "dropped_items": 2}
go.opentelemetry.io/collector/exporter/exporterhelper.newQueueSender.func1
	go.opentelemetry.io/collector/exporter@v0.96.0/exporterhelper/queue_sender.go:97
go.opentelemetry.io/collector/exporter/internal/queue.(*boundedMemoryQueue[...]).Consume
	go.opentelemetry.io/collector/exporter@v0.96.0/internal/queue/bounded_memory_queue.go:57
go.opentelemetry.io/collector/exporter/internal/queue.(*Consumers[...]).Start.func1
	go.opentelemetry.io/collector/exporter@v0.96.0/internal/queue/consumers.go:43
2024-03-19T03:18:16.501Z	info	LogsExporter	{"kind": "exporter", "data_type": "logs", "name": "debug", "resource logs": 1, "log records": 4}
2024-03-19T03:18:16.501Z	info	ResourceLog #0

ba1ajinaidu avatar Mar 19 '24 03:03 ba1ajinaidu

And other logs flow through this pipeline to the same endpoint without issue?

Can you point your OTLP exporter to an OTLP receiver in another collector (or another pipeline in this collector)? I want to make sure there isn't something in the data that OTLP isn't handling correctly (this is extremely unlikely).
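A sketch of the single-collector variant, assuming the default OTLP gRPC port 4317 is free on the pod (the otlp/local and logs/verify names are just illustrative):

config:
  receivers:
    otlp:
      protocols:
        grpc:
          endpoint: 127.0.0.1:4317
  exporters:
    otlp/local:
      # point the troubled pipeline at this collector's own OTLP receiver
      endpoint: 127.0.0.1:4317
      tls:
        insecure: true
    debug:
      verbosity: detailed
  service:
    pipelines:
      logs/system:
        receivers: [filelog/host]
        processors: [batch]
        exporters: [otlp/local]
      logs/verify:
        receivers: [otlp]
        exporters: [debug]

If the records arrive intact in logs/verify, the failure is on the Quickwit side rather than in the OTLP payload.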

TylerHelmuth avatar Mar 19 '24 19:03 TylerHelmuth

And other logs flow through this pipeline to the same endpoint without issue?

Yes. I tried pointing the exporter to a receiver in a new pipeline and I'm still seeing the same error.

ba1ajinaidu avatar Mar 20 '24 02:03 ba1ajinaidu

Did you find any resolution for this error? I'm getting the same error when the exporter is Tempo.

ghost avatar Oct 07 '24 21:10 ghost