
Sending logs via systemd-run/logger/systemd-cat drops a small percentage of the messages

Open • href opened this issue 3 years ago • 9 comments

To thoroughly test our Loki deployment we wrote a script that does an end-to-end test through this path:

  1. Write to Journal.
  2. Promtail scrapes Journal and ships to Loki.
  3. Loki receives the log and stores it.
  4. A query to Loki reveals the unique message written to Journal (a sketch of such a check follows this list).
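
For illustration, a minimal sketch of such a check in bash, assuming logcli is configured against the Loki instance and the host label matches the machine's hostname (the e2e-check tag and the 30-second wait are arbitrary):

#!/usr/bin/env bash
# Write a unique token to the journal, give Promtail/Loki time to ship it, then query for it.
set -euo pipefail

msg="e2e-$(date +%s)-$RANDOM"                 # unique message to search for
echo "$msg" | systemd-cat -t e2e-check        # step 1: write to the journal
sleep 30                                      # steps 2-3: wait for Promtail to ship and Loki to store
if logcli query --since=5m -o raw "{host=\"$(hostname)\"} |= \"$msg\"" | grep -q "$msg"; then
    echo "OK: $msg found in Loki"             # step 4: the query found the message
else
    echo "MISSING: $msg not found in Loki"
fi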

We initially wrote it with the following approaches (trying them out one by one):

  • logger --journal <message-file>
  • systemd-run echo <message>
  • echo <message> | systemd-cat

All of them turn out to be unreliable: they work most of the time, but roughly one in ten messages is dropped. When the same messages are written from a long-lived systemd service instead, they arrive 100% of the time.

What is interesting is that messages sent through logger and systemd-cat do not show up in promtail -log.level debug, while those sent through systemd-run do. All of them, however, show up in promtail -dry-run. And of course, most of them do make it to Loki.
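
To narrow down where the drop happens, one can first confirm that the messages at least reached the local journal and compare that count with what Loki returns for the same window (a sketch; the hostname label and the foo token mirror the reproduction steps below):

# How many "foo" messages does the local journal actually contain?
journalctl --since "10 min ago" -o cat MESSAGE=foo | wc -l

# And how many of them does Loki return for the same window?
logcli query --since=10m --limit=1000 -o raw '{host="hostname"} |= "foo"' | wc -l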

To Reproduce
Steps to reproduce the behavior (see configs further down):

  1. Run Loki.
  2. Run Promtail on another host.
  3. Run the following query on Loki:
logcli query -o raw '{host="hostname"} |= "foo"' --tail
  4. Run any of the following commands on the host:
echo "MESSAGE=foo" > msg; logger --journal msg
echo foo | systemd-cat
systemd-run echo foo
  5. Notice that maybe 1 in 10 messages never reaches the query output (a looped variant of this test is sketched below).
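
A looped variant of this test, which numbers the messages so that dropped ones can be identified individually (a sketch; the drop-test tag, the /tmp/msg path and the hostname label are placeholders):

# Send 20 numbered messages through each of the three paths, spaced out.
for i in $(seq 1 20); do
    echo "foo-cat-$i" | systemd-cat -t drop-test
    systemd-run --quiet echo "foo-run-$i"
    printf 'MESSAGE=foo-logger-%s\n' "$i" > /tmp/msg && logger --journald /tmp/msg
    sleep 10
done

# Afterwards, list which sequence numbers actually made it to Loki.
logcli query --since=1h --limit=200 -o raw '{host="hostname"} |= "foo-"' | sort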

Expected behavior
No messages are dropped.

Environment:

  • Infrastructure: VMs (Ubuntu), Promtail 2.3, Loki 2.4 (from main-04ab08d).
  • Deployment tool: Ansible

Promtail Config

server:
  http_listen_port: 9080
  http_listen_address: 127.0.0.1
  grpc_listen_port: 9085
  grpc_listen_address: 127.0.0.1
  log_level: info

positions:
  filename: /var/lib/promtail/positions.yaml

clients:
- url: https://1.2.3.4/distributor/loki/api/v1/push
  # 5 10 20 40 80 160 320 320 ... 320
  # For a total time of 11195s ~185m before logs are lost
  # 5 + 10 + 20 + 40 + 80 + 160 + 320 + 33 * 320
  backoff_config:
    min_period: 5s
    max_period: 320s
    max_retries: 40
  tls_config:
    ca_file: /etc/pki/realms/host/CA.crt
    cert_file: /etc/pki/realms/host/default.crt
    key_file: /etc/pki/realms/host/default.key
    server_name: 1.2.3.4

scrape_configs:
- job_name: journal
  journal:
    json: true
    max_age: 12h
    labels:
      env: test
      component: internal
      path: /var/log/journal
  relabel_configs:
    - source_labels: ['__journal__hostname']
      target_label: 'host'
    - source_labels: ['__journal__systemd_unit']
      target_label: 'unit'
  pipeline_stages:
  - match:
      selector: '{unit!~""}'
      stages:
      - json:
          expressions:
            {'cmdline': '_CMDLINE', 'facility': 'SYSLOG_FACILITY', 'message': 'MESSAGE', 'priority': 'PRIORITY', 'unit': '_SYSTEMD_UNIT'}
      - pack:
          labels:
            ['cmdline', 'facility', 'message', 'priority', 'unit']
      - labeldrop:
          - unit
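
To compare the -dry-run and normal behaviour mentioned above, the same config can be run in both modes; dry-run prints the entries Promtail would ship instead of pushing them (the config path is an example; use whatever path the Promtail unit points at):

# Normal run with debug logging, pushing to Loki as usual.
promtail -config.file=/etc/promtail/config.yml -log.level=debug

# Dry run: same scrape config, but entries are printed to stdout instead of being sent.
promtail -config.file=/etc/promtail/config.yml -dry-run -log.level=debug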

Loki Config

auth_enabled: false

server:
  log_level: info

distributor:
  ring:
    kvstore:
      store: consul
      consul:
        host: localhost:8500

ingester:
  lifecycler:
    address: 127.0.0.1
    ring:
      kvstore:
        store: consul
        consul:
          host: localhost:8500
      replication_factor: 1
    final_sleep: 0s
  max_transfer_retries: 0
  flush_op_timeout: 60s
  flush_check_period: 30s
  concurrent_flushes: 8
  chunk_target_size: 4194304
  max_chunk_age: 2h
  chunk_idle_period: 2h
  wal:
    enabled: true
#   dir: configured in the service file

schema_config:
  configs:
    - from: 2021-09-03
      store: boltdb-shipper
      object_store: filesystem
      schema: v11
      index:
        prefix: index_
        period: 24h

storage_config:
  boltdb_shipper:
    active_index_directory: /srv/logs/loki/index
    cache_location: /srv/logs/loki/index_cache
    shared_store: filesystem
  filesystem:
    directory: /srv/loki/chunks

compactor:
  working_directory: /srv/loki/compactor
  shared_store: filesystem
  compaction_interval: 5m
  retention_enabled: true
  retention_delete_delay: 2h
  retention_delete_worker_count: 150

limits_config:
  reject_old_samples: true
  reject_old_samples_max_age: 168h
  unordered_writes: true
  retention_period: "1d"
  retention_stream:
  - selector: '{host="test"}'
    priority: 1
    period: 12h

chunk_store_config:
  max_look_back_period: 0s

table_manager:
  retention_deletes_enabled: false
  retention_period: 0s

frontend_worker:
  frontend_address: '127.0.0.1:9091'

frontend:
  tail_proxy_url: http://127.0.0.1:3112

querier:
  query_timeout: 2m
  query_ingesters_within: 168h

query_range:
  # make queries more cache-able by aligning them with their step intervals
  align_queries_with_step: true
  max_retries: 5
  # parallelize queries in 15min intervals
  split_queries_by_interval: 15m
  cache_results: true
  parallelise_shardable_queries: true

  results_cache:
    cache:
      # We're going to use the in-process "FIFO" cache
      enable_fifocache: true
      fifocache:
        size: 1024
        validity: 24h

href avatar Oct 07 '21 13:10 href

Hi! This issue has been automatically marked as stale because it has not had any activity in the past 30 days.

We use a stalebot among other tools to help manage the state of issues in this project. A stalebot can be very useful in closing issues in a number of cases; the most common is closing issues or PRs where the original reporter has not responded.

Stalebots are also emotionless and cruel and can close issues which are still very relevant.

If this issue is important to you, please add a comment to keep it open. More importantly, please add a thumbs-up to the original issue entry.

We regularly sort for closed issues which have a stale label sorted by thumbs up.

We may also:

  • Mark issues as revivable if we think it's a valid issue but isn't something we are likely to prioritize in the future (the issue will still remain closed).
  • Add a keepalive label to silence the stalebot if the issue is very common/popular/important.

We are doing our best to respond, organize, and prioritize all issues but it can be a challenging task, our sincere apologies if you find yourself at the mercy of the stalebot.

stale[bot] avatar Mar 03 '22 02:03 stale[bot]

This bug still exists, though we work around it by not using systemd-run/logger/systemd-cat.

href avatar Mar 04 '22 10:03 href

Hi! This issue has been automatically marked as stale because it has not had any activity in the past 30 days.

stale[bot] avatar Apr 17 '22 05:04 stale[bot]

We can still reproduce this bug.

href avatar Apr 19 '22 07:04 href

@href is this still an issue in current release?

cstyan avatar Nov 08 '23 01:11 cstyan

Kind of, but now I'm not sure if it's the same issue we've seen before, or if I just discovered a lead:

I generated the messages as described above, and they all show up in Grafana, but not together. Most of the time when I run the query I get 13 out of 15 messages, but sometimes I get 2 out of 15. From the timestamps I can tell that, taken together, all of the messages are in Loki.

However, they never show up together: across multiple refreshes in Grafana I see either the 13 messages or the 2 messages. In the Loki CLI I only ever see the 13 messages.

We run two querier processes, so maybe there is some kind of disagreement about what they are seeing? Could this be a configuration issue? We didn't have Grafana at the time of the original report, so this could still be the same issue, as Loki always seems to show the same set of messages (so my original report holds).

But in Grafana I can now tell that the messages do make it to Loki. It does not seem to be a Promtail issue.
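
One way to test the two-querier theory would be to send the same query to each querier's HTTP API directly and compare the results (a sketch; querier-1:3100 and querier-2:3100 are placeholders for the real querier addresses):

q='{host="lab-test1-lpg1"} |= "foo"'
start="$(date -d '-1 hour' +%s)000000000"     # query_range expects Unix nanoseconds
end="$(date +%s)000000000"

for addr in querier-1:3100 querier-2:3100; do # placeholder addresses
    echo "== $addr =="
    curl -sG "http://$addr/loki/api/v1/query_range" \
        --data-urlencode "query=$q" \
        --data-urlencode "limit=100" \
        --data-urlencode "start=$start" \
        --data-urlencode "end=$end" | wc -c   # crude comparison: differing sizes suggest differing result sets
done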

href avatar Nov 08 '23 07:11 href

so via curl or logcli you always see all the expected log results? could you share the exact query you're running?

we may need to file a grafana issue

cstyan avatar Nov 08 '23 22:11 cstyan

I saw the issue in logcli as well:

  • In Grafana I would usually see 13/15 messages, and sometimes the other 2/15 instead (mostly 13/15).
  • With logcli I would only ever see 13/15 messages.

Today, when I look for the messages from yesterday, only 13/15 show up, even in Grafana. The other 2/15 messages seem to be gone.

I ran another test just now, and I saw the following behavior in both logcli and Grafana. This time it's either 25/25, or 0/25:

# After having sent `echo foo | systemd-cat` on `lab-test1-lpg1` 25 times (spaced out).
root@lab-log1-lpg1:~# for i in {1..20}; do logcli query -qo raw '{host="lab-test1-lpg1"} |= "foo" | json | line_format `{{ .message }}`' | wc -l; done
25
25
25
0
0
0
0
0
0
0
0
25
25
25
25
25
25
25
25
25

In Grafana it's the same.

href avatar Nov 09 '23 08:11 href

May or may not be relevant, but I see similar message-dropping behavior in a NixOS systemd script, scheduled by a timer, that uses systemd-cat or logger. I use them to set a priority on messages.

Plain echo ... never fails.
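
For reference, the priority-setting variants look roughly like this (the backup-timer tag and the message are just examples):

# systemd-cat: -t sets the syslog identifier, -p the priority
echo "backup finished with warnings" | systemd-cat -t backup-timer -p warning

# logger equivalent: facility.level plus a tag
logger -p user.warning -t backup-timer "backup finished with warnings"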

chunjiw avatar Feb 16 '24 01:02 chunjiw