conmon icon indicating copy to clipboard operation
conmon copied to clipboard

podman logging with k8s-file missing entries

Open slartibart70 opened this issue 7 months ago • 2 comments

Issue Description

I am running gitlab-ce in rootless podman. This app is more than spamming the journal with log entries (waaay to much and mostly unneccesary infos)

Therefore i decided to use log-driver 'k8s-file' with log-options max_size: "20mb" and 'path=log.json'

This is working fine regarding the redirection of logs, but works only for the first 'batch' of logfile-entries. Example:

2025-05-27T09:58:34.663942371+02:00 stdout F Thank you for using GitLab Docker Image!
2025-05-27T09:58:34.663942371+02:00 stdout F Current version: gitlab-ce=18.0.1-ce.0
2025-05-27T09:58:34.663942371+02:00 stdout F 
2025-05-27T09:58:34.664026975+02:00 stdout F Configure GitLab for your system by editing /etc/gitlab/gitlab.rb file
2025-05-27T09:58:34.664026975+02:00 stdout F And restart this container to reload settings.
2025-05-27T09:58:34.664026975+02:00 stdout F To do it use docker exec:
2025-05-27T09:58:34.664026975+02:00 stdout F 
...
2025-05-27T09:59:09.983421664+02:00 stdout P       * link[/opt/gitlab/init/gitlab-kas] action create
2025-05-27T09:59:09.983996564+02:00 stdout F  (up to date)
2025-05-27T09:59:09.984331287+02:00 stdout P       * file[/opt/gitlab/sv/gitlab-kas/down] action nothing
2025-05-27T09:59:09.984380551+02:00 stdout F  (skipped due to action :nothing)
2025-05-27T09:59:09.984676988+02:00 stdout P       * directory[/opt/gitlab/service] action create
2025-05-27T09:59:09.985339113+02:00 stdout F  (up to date)
2025-05-27T09:59:09.985611214+02:00 stdout P       * link[/opt/gitlab/service/gitlab-kas] action create
2025-05-27T09:59:09.986211203+02:00 stdout F [2025-05-27T07:59:09+00:00] INFO: link[/opt/gitlab/service/gitlab-kas] created
...

This looks fine and identical to the info of the journal (when logging into journal, of course)

If the logs becomes big enough (around 9mb) the last readable entries are:

2025-05-27T10:05:41.909446759+02:00 stdout F 2025-05-27_08:05:40.60530 ts=2025-05-27T08:05:40.605Z caller=tls_config.go:274 level=info msg="Listening on" address=127.0.0.1:9187
2025-05-27T10:05:41.909446759+02:00 stdout F 2025-05-27_08:05:40.60532 ts=2025-05-27T08:05:40.605Z caller=tls_config.go:277 level=info msg="TLS is disabled." http2=false address=127.0.0.1:9187
2025-05-27T10:05:41.909446759+02:00 stdout F 
2025-05-27T10:05:41.909446759+02:00 stdout F ==> /var/log/gitlab/old.json <==

After this, the log only contains entries similar like these:

2025-05-27T10:05:41.909446759+02:00 stdout F 2025-05-27T09:57:59.018347760+02:00 stdout F 2025-05-27T09:57:58.998978708+02:00 stdout F 2025-05-27T09:57:58.986519025+02:00 stdout F 2025-05-27T09:57:58.974253822+02:00 stdout P 2025-05-27T09:57:58.
2025-05-27T10:05:41.909446759+02:00 stdout F 2025-05-27T09:57:59.018347760+02:00 stdout F 2025-05-27T09:57:58.998978708+02:00 stdout F 2025-05-27T09:57:58.986519025+02:00 stdout F 2025-05-27T09:57:58.974270295+02:00 stdout F 962840334+02:00 stdout F 2025-05-27T09:57:58.950363473+02:00 stdout F 2025-05-27T09:57:58.940401862+02:00 stdout F 2025-05-27T09:57:58.931263658+02:00 stdout F 2025-05-27T09:57:58.922672097+02:00 stdout F 2025-05-27T09:57:58.913100554+02:00 stdout F  stdout F 304107+02:00 stdout F 2025-05-27T09:57:58.311233149+02:00 stdout F 2025-05-27T09:57:58.306
2025-05-27T10:05:41.909446759+02:00 stdout P 2025-05-27T09:57:59.018347760+02:00 stdout F 2025-05-27T09:57:58.998978708+02:00 stdout F 2025-05-27T09:57:58.986519025+02:00 stdout F 2025-05-27T09:57:58.974270295+02:00 stdout F 2025-05-27T09:57:58.962840334+02:00 stdout F 2025-05-27T09:57:58.950363473+02:00 stdout F 2025-05-27T09:57:58.940401862+02:00 stdout F 2025-05-27T09:57:58.931263658+02:00 stdout P 2025-05-27T09:57:58.922672097+02:00 stdout F 2025-05-27T09:57:58.913100
2025-05-27T10:05:41.909467909+02:00 stdout F 554+02:00 stdout F 2025-05-27T09:57:58.905843441+02:00 stdout P 2025-05-27T09:57:58.899086862+02:00 stdout F 20
...

2025-05-27T10:05:41.909446759+02:00 stdout F 2025-05-27T09:57:59.018347760+02:00 stdout F 2025-05-27T09:57:58.998978708+02:00 stdout F 185372+02:00 stdout F 2025-05-27T09:57:58.857516820+02:00

Steps to reproduce the issue

i am running this as quadlet/service. Commandline is:

ExecStart=/usr/bin/podman run --name gitlab-ce --cidfile=%t/%N.cid --replace --rm --log-driver k8s-file --log-opt max-size=20mb --log-opt path=/storage/gitlab/gitlab-ce/logs/podman-gitlab-ce-log.json --log-opt tag= --cgroups=split --stop-timeout 500 --sdnotify=conmon -d -v /storage/gitlab/gitlab-ce/config:/etc/gitlab:Z -v /storage/gitlab/gitlab-ce/logs:/var/log/gitlab:Z -v /storage/gitlab/gitlab-ce/data:/var/opt/gitlab:Z --publish 8443:443 --publish 2222:22 --publish 5005:5005 --label systemd.type=generated docker.io/gitlab/gitlab-ce:18.0.1-ce.0

Describe the results you received

see above

Describe the results you expected

A logfile similar to the journal, where such 'garbled' output never happens (even if running for a long time)

Maybe sth. that is easier to reproduce: i tested with alpine and 'yes' to produce a logfile:

podman run --rm -it --log-driver k8s-file --log-opt max-size=1mb --log-opt path=./log.json alpine

Inside the container namespace (podman exec) just run yes "with some very long text and even more text .. 012345678901234567890123456789 .. you get the idea" and monitoring the logfile (tail -f) in a second shell.

I cannot reproduce gitlabs garbled output, but i can see some entries are missing in the log, looking to me like synchronization/flushing problems?

podman info output

podman info 
host:
  arch: amd64
  buildahVersion: 1.40.0
  cgroupControllers:
  - cpu
  - memory
  - pids
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: conmon-2.1.13-1.fc42.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.1.13, commit: '
  cpuUtilization:
    idlePercent: 97.54
    systemPercent: 1.85
    userPercent: 0.62
  cpus: 4
  databaseBackend: sqlite
  distribution:
    distribution: fedora
    variant: cloud
    version: "42"
  eventLogger: journald
  freeLocks: 2047
  hostname: AZ-CBC-VM-GammaSwat
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 10057
      size: 1
    - container_id: 1
      host_id: 1048576
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 10057
      size: 1
    - container_id: 1
      host_id: 1048576
      size: 65536
  kernel: 6.14.7-300.fc42.x86_64
  linkmode: dynamic
  logDriver: journald
  memFree: 462880768
  memTotal: 16756625408
  networkBackend: netavark
  networkBackendInfo:
    backend: netavark
    dns:
      package: aardvark-dns-1.15.0-1.fc42.x86_64
      path: /usr/libexec/podman/aardvark-dns
      version: aardvark-dns 1.15.0
    package: netavark-1.15.0-1.fc42.x86_64
    path: /usr/libexec/podman/netavark
    version: netavark 1.15.0
  ociRuntime:
    name: crun
    package: crun-1.21-1.fc42.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 1.21
      commit: 10269840aa07fb7e6b7e1acff6198692d8ff5c88
      rundir: /run/user/10057/crun
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +LIBKRUN +WASM:wasmedge +YAJL
  os: linux
  pasta:
    executable: /usr/bin/pasta
    package: passt-0^20250512.g8ec1341-1.fc42.x86_64
    version: ""
  remoteSocket:
    exists: true
    path: /run/user/10057/podman/podman.sock
  rootlessNetworkCmd: pasta
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: true
  serviceIsRemote: false
  slirp4netns:
    executable: ""
    package: ""
    version: ""
  swapFree: 8575066112
  swapTotal: 8589930496
  uptime: 154h 2m 10.00s (Approximately 6.42 days)
  variant: ""
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  - ipvlan
  volume:
  - local
registries:
  alpha:5001:
    Blocked: false
    Insecure: true
    Location: alpha:5001
    MirrorByDigestOnly: false
    Mirrors: null
    Prefix: alpha:5001
    PullFromMirror: ""
  search:
  - registry.fedoraproject.org
  - registry.access.redhat.com
  - docker.io
store:
  configFile: /home/gitlab/.config/containers/storage.conf
  containerStore:
    number: 1
    paused: 0
    running: 1
    stopped: 0
  graphDriverName: overlay
  graphOptions: {}
  graphRoot: /home/gitlab/.local/share/containers/storage
  graphRootAllocated: 274877906944
  graphRootUsed: 9654018048
  graphStatus:
    Backing Filesystem: btrfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Supports shifting: "false"
    Supports volatile: "true"
    Using metacopy: "false"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 1
  runRoot: /run/user/10057/containers
  transientStore: false
  volumePath: /home/gitlab/.local/share/containers/storage/volumes
version:
  APIVersion: 5.5.0
  BuildOrigin: Fedora Project
  Built: 1747180800
  BuiltTime: Wed May 14 02:00:00 2025
  GitCommit: 0dbcb51477ee7ab8d3b47d30facf71fc38bb0c98
  GoVersion: go1.24.3
  Os: linux
  OsArch: linux/amd64
  Version: 5.5.0

Podman in a container

No

Privileged Or Rootless

None

Upstream Latest Release

Yes

Additional environment details

Additional environment details

Additional information

Additional information like issue happens only occasionally or issue happens with a particular architecture or on a particular setting

slartibart70 avatar May 27 '25 08:05 slartibart70

The log file is written by conmon so I move the issue there.

Luap99 avatar May 27 '25 08:05 Luap99

and, one more severe thing; this is consuming 100% cpu load on on core.... To be more precise: when podman/conmon is logging into a file, there was a tail running

tail --follow=name --retry /var/log/gitlab/sshd/current /var/log/gitlab/gitlab-rails/gitlab-rails-db-migrate-2025-05-26-20-12-40.log /var/log/gitlab/gitlab-rails/production.log /var/log/gitlab/gitlab-rails/application_json.log /var/log/gitlab/gitlab-rails/grpc.log /var/log/gitlab/gitlab-rails/database_load_balancing.log /var/log/gitlab/gitlab-rails/service_measurement.log...

consuming 50% of a core, and conmon doing its part

/usr/bin/conmon --api-version 1 -c bb5a2a61e0 ...

consuming ~90% of a core

all this for logging???

Killing the 'tail' command helps, but then logging is also killed

slartibart70 avatar Jun 12 '25 08:06 slartibart70

PR is here: https://github.com/containers/conmon/pull/570

jnovy avatar Jul 18 '25 08:07 jnovy

The PR got merged, closing.

jnovy avatar Jul 30 '25 06:07 jnovy