Duplicate logs in loki with multiple replicas of Promtail
Describe the bug I have installed Promtail as a kubernetes deployment running 3 replicas (instead of Daemonset) and read the log lines being written to an Azure file share that is mounted to the application pods (using static_configs). I have added a timestamp pipeline stage to the Promtail config so that the timestamp is added from the log line itself instead of promtail adding the timestamp at which the log line is read.
snippets: pipelineStages: - json: expressions: namespace: timestamp: node_message: message - json: expressions: message: data source: node_message - timestamp: source: timestamp format: UnixMs - labels: namespace: - output: source: message
scrapeConfigs: | - job_name: local pipeline_stages: {{- toYaml .Values.config.snippets.pipelineStages | nindent 4 }} static_configs: - targets: - localhost labels: __path__: "/var/log/**/*.log"
As all the 3 replicas are reading from the same log file, the log line would be ingested to loki thrice with the same timestamp and same set of labels. But when I query the logs using grafana or logcli, I see duplicate entries for the same log line.
The issue disappears when only one replica of Promtail is running.
To Reproduce Steps to reproduce the behavior:
- Install Promtail as a Deployment (instead of Daemonset) with 3 replicas. [promtail helm chart v6.2.1]
- Configure Promtail to discover logs from a file system using static_configs
- Configure Promtail to use the timestamp available in the log line
- Write the logs to the log file and wait for the logs to be ingested to loki.
- Wait for 5 minutes within which the duplicate log lines would start appearing in grafana / logcli
Expected behavior As the timestamp and the labels are the same for these log lines, I am expecting loki to dedupe or consider the log line as the same and avoid showing duplicates.
Environment:
- Infrastructure: Kubernetes
- Deployment tool: Helm [promtail v6.2.1, loki v1.8.0]
Screenshots, Promtail config, or terminal output

I am not sure if I am entirely correct here, however this seems to be documented for the boltdb-shipper here: https://grafana.com/docs/loki/latest/operations/storage/boltdb-shipper/#write-deduplication-disabled
I have not configured chunk_store_config to use memcached in this case. But the replication_factor is set to 3 by default.
As per https://grafana.com/docs/loki/latest/fundamentals/architecture/components/#querier, the querier should at least dedupe the data before returning the results.
As per the below document, if the incoming line exactly matches the previously received line (matching both the previous timestamp and log text), the incoming line will be treated as an exact duplicate and ignored.
https://grafana.com/docs/loki/latest/fundamentals/architecture/components/#timestamp-ordering
But this doesn't seem to be happening.
I wonder if the timestamp is actually the same though? Can't it be off in the realm of ms?
The Time and tsNs are exactly the same for these duplicate logs. The timestamp is actually read from the log line itself instead of the time at which the log line is read by promtail. Please find attached the screenshot below,

Here is the log line read by promtail that has the timestamp 1662969080523
{"instance_type":"test","level":"info","message":{"data":{"message":"Loading palette nodes","pod_name":"del-***-3mi-test-***-d6d4db99f-jmkqf"}},"namespace":"del-jay-3mi-test","timestamp":1662969080523}
Thanks for the information, you knew more than I have. Hopefully someone more qualified will be able to take a look :)
I'm not sure if this is related but I also have duplicate log entries for my docker container. It's a container that runs every minute so maybe Promtail does not like that? I see 10 times each log line when I query with Loki.
But I don't have any kind of high availability setup, it's a single Promtail pushing to a single Loki.
hi everyone, is this issue still present?