loki Duplicate logs in loki with multiple replicas of Promtail

Describe the bug I have installed Promtail as a kubernetes deployment running 3 replicas (instead of Daemonset) and read the log lines being written to an Azure file share that is mounted to the application pods (using static_configs). I have added a timestamp pipeline stage to the Promtail config so that the timestamp is added from the log line itself instead of promtail adding the timestamp at which the log line is read.

snippets: pipelineStages: - json: expressions: namespace: timestamp: node_message: message - json: expressions: message: data source: node_message - timestamp: source: timestamp format: UnixMs - labels: namespace: - output: source: message

scrapeConfigs: | - job_name: local pipeline_stages: {{- toYaml .Values.config.snippets.pipelineStages | nindent 4 }} static_configs: - targets: - localhost labels: __path__: "/var/log/**/*.log"

As all the 3 replicas are reading from the same log file, the log line would be ingested to loki thrice with the same timestamp and same set of labels. But when I query the logs using grafana or logcli, I see duplicate entries for the same log line.

The issue disappears when only one replica of Promtail is running.

To Reproduce Steps to reproduce the behavior:

Install Promtail as a Deployment (instead of Daemonset) with 3 replicas. [promtail helm chart v6.2.1]
Configure Promtail to discover logs from a file system using static_configs
Configure Promtail to use the timestamp available in the log line
Write the logs to the log file and wait for the logs to be ingested to loki.
Wait for 5 minutes within which the duplicate log lines would start appearing in grafana / logcli

Expected behavior As the timestamp and the labels are the same for these log lines, I am expecting loki to dedupe or consider the log line as the same and avoid showing duplicates.

Environment:

Infrastructure: Kubernetes
Deployment tool: Helm [promtail v6.2.1, loki v1.8.0]

Screenshots, Promtail config, or terminal output

Sep 13 '22 09:09 jjayabal23

I am not sure if I am entirely correct here, however this seems to be documented for the boltdb-shipper here: https://grafana.com/docs/loki/latest/operations/storage/boltdb-shipper/#write-deduplication-disabled

Sep 14 '22 20:09 fourstepper

I have not configured chunk_store_config to use memcached in this case. But the replication_factor is set to 3 by default.

As per https://grafana.com/docs/loki/latest/fundamentals/architecture/components/#querier, the querier should at least dedupe the data before returning the results.

Sep 15 '22 05:09 jjayabal23

As per the below document, if the incoming line exactly matches the previously received line (matching both the previous timestamp and log text), the incoming line will be treated as an exact duplicate and ignored.

https://grafana.com/docs/loki/latest/fundamentals/architecture/components/#timestamp-ordering

But this doesn't seem to be happening.

Sep 20 '22 10:09 jjayabal23

I wonder if the timestamp is actually the same though? Can't it be off in the realm of ms?

Sep 20 '22 12:09 fourstepper

The Time and tsNs are exactly the same for these duplicate logs. The timestamp is actually read from the log line itself instead of the time at which the log line is read by promtail. Please find attached the screenshot below,

loki_log

Here is the log line read by promtail that has the timestamp 1662969080523

{"instance_type":"test","level":"info","message":{"data":{"message":"Loading palette nodes","pod_name":"del-***-3mi-test-***-d6d4db99f-jmkqf"}},"namespace":"del-jay-3mi-test","timestamp":1662969080523}

Sep 21 '22 05:09 jjayabal23

Thanks for the information, you knew more than I have. Hopefully someone more qualified will be able to take a look :)

Sep 21 '22 06:09 fourstepper

I'm not sure if this is related but I also have duplicate log entries for my docker container. It's a container that runs every minute so maybe Promtail does not like that? I see 10 times each log line when I query with Loki.

But I don't have any kind of high availability setup, it's a single Promtail pushing to a single Loki.

Oct 10 '22 12:10 cedsana

hi everyone, is this issue still present?

Nov 16 '23 21:11 cstyan