mirror-registry
Containers are running but registry is unresponsive at some point after installation
All the containers are running, but at some point after installation the registry stops responding (no response from curl https://localhost:8443).
I have to restart the pods, or even reboot the host, to get it working again.
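To make "no response" easier to report, the curl check can be run with explicit timeouts and its exit status turned into a label. This is only a rough sketch: the function names `classify_curl_exit` and `check_registry` are made up for illustration, the timeout values are arbitrary, and `--insecure` assumes the registry uses a self-signed certificate.

```shell
#!/usr/bin/env bash
# Map a curl exit status to a short label (codes from the curl man page).
classify_curl_exit() {
  case "$1" in
    0)  echo "ok" ;;
    7)  echo "connect_failed" ;;       # could not connect (e.g. refused)
    28) echo "timeout" ;;              # no complete answer within --max-time
    35) echo "tls_handshake_error" ;;  # TLS handshake failed outright
    60) echo "cert_not_trusted" ;;     # certificate could not be verified
    *)  echo "other($1)" ;;
  esac
}

# Hypothetical helper: probe the registry URL and print one label.
# --insecure is an assumption (self-signed certificate); drop it if the CA is trusted.
check_registry() {
  local url="${1:-https://localhost:8443}"
  local rc=0
  curl --silent --show-error --insecure --output /dev/null \
       --connect-timeout 5 --max-time 15 "$url" || rc=$?
  classify_curl_exit "$rc"
}
```

Calling `check_registry` while the registry is stuck should show whether curl is being refused or is timing out, which narrows down where the request gets lost.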
All the pods are running:
[root@bastion ~]# podman ps -a
CONTAINER ID  IMAGE                                                    COMMAND         CREATED       STATUS       PORTS                   NAMES
db266da38b9c  registry.access.redhat.com/ubi8/pause:8.7-6              infinity        13 hours ago  Up 13 hours  0.0.0.0:8443->8443/tcp  5e70ee01733b-infra
767d8f665354  registry.redhat.io/rhel8/redis-6:1-92.1669834635         run-redis       13 hours ago  Up 13 hours  0.0.0.0:8443->8443/tcp  quay-redis
73b03983db2f  registry.redhat.io/rhel8/postgresql-10:1-203.1669834630  run-postgresql  13 hours ago  Up 13 hours  0.0.0.0:8443->8443/tcp  quay-postgres
41c21e84bb3e  registry.redhat.io/quay/quay-rhel8:v3.8.14               registry        13 hours ago  Up 13 hours  0.0.0.0:8443->8443/tcp  quay-app
New logs keep coming in, so the containers seem to be running fine... I guess?
[root@bastion ~]# podman logs --tail=10 -f quay-app
exportactionlogsworker stdout | 2024-03-26 00:28:00,067 [52] [INFO] [apscheduler.executors.default] Running job "QueueWorker.poll_queue (trigger: interval[0:01:00], next run at: 2024-03-26 00:29:00 UTC)" (scheduled at 2024-03-26 00:28:00.067443+00:00)
exportactionlogsworker stdout | 2024-03-26 00:28:00,071 [52] [INFO] [apscheduler.executors.default] Job "QueueWorker.poll_queue (trigger: interval[0:01:00], next run at: 2024-03-26 00:29:00 UTC)" executed successfully
notificationworker stdout | 2024-03-26 00:28:04,724 [63] [INFO] [apscheduler.executors.default] Running job "QueueWorker.poll_queue (trigger: interval[0:00:10], next run at: 2024-03-26 00:28:14 UTC)" (scheduled at 2024-03-26 00:28:04.724010+00:00)
notificationworker stdout | 2024-03-26 00:28:04,727 [63] [INFO] [apscheduler.executors.default] Job "QueueWorker.poll_queue (trigger: interval[0:00:10], next run at: 2024-03-26 00:28:14 UTC)" executed successfully
repositorygcworker stdout | 2024-03-26 00:28:11,768 [75] [INFO] [apscheduler.executors.default] Running job "QueueWorker.run_watchdog (trigger: interval[0:01:00], next run at: 2024-03-26 00:29:11 UTC)" (scheduled at 2024-03-26 00:28:11.767795+00:00)
repositorygcworker stdout | 2024-03-26 00:28:11,769 [75] [INFO] [apscheduler.executors.default] Job "QueueWorker.run_watchdog (trigger: interval[0:01:00], next run at: 2024-03-26 00:29:11 UTC)" executed successfully
gcworker stdout | 2024-03-26 00:28:12,861 [53] [INFO] [apscheduler.executors.default] Running job "GarbageCollectionWorker._garbage_collection_repos (trigger: interval[0:00:30], next run at: 2024-03-26 00:28:42 UTC)" (scheduled at 2024-03-26 00:28:12.860612+00:00)
gcworker stdout | 2024-03-26 00:28:12,868 [53] [INFO] [apscheduler.executors.default] Job "GarbageCollectionWorker._garbage_collection_repos (trigger: interval[0:00:30], next run at: 2024-03-26 00:28:42 UTC)" executed successfully
notificationworker stdout | 2024-03-26 00:28:14,724 [63] [INFO] [apscheduler.executors.default] Running job "QueueWorker.poll_queue (trigger: interval[0:00:10], next run at: 2024-03-26 00:28:24 UTC)" (scheduled at 2024-03-26 00:28:14.724010+00:00)
notificationworker stdout | 2024-03-26 00:28:14,731 [63] [INFO] [apscheduler.executors.default] Job "QueueWorker.poll_queue (trigger: interval[0:00:10], next run at: 2024-03-26 00:28:24 UTC)" executed successfully
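The excerpt above only contains lines from background workers (exportactionlogsworker, notificationworker, repositorygcworker, gcworker), so on its own it does not show whether the processes that serve HTTP requests are still logging. One way to check is to summarize a window of logs per component. The sketch below assumes every line follows the "<component> stdout | <date> <time> ..." layout seen here; `summarize_components` is a hypothetical helper name and the 15-minute window is arbitrary.

```shell
#!/usr/bin/env bash
# Summarize Quay log lines read from stdin, one row per component:
#   <component> <line count> <date and time of the last line seen>
# Lines that do not match the "<component> stdout|stderr | ..." layout are skipped.
summarize_components() {
  awk '$2 ~ /^std(out|err)$/ && $3 == "|" {
         count[$1]++
         last[$1] = $4 " " $5
       }
       END {
         for (name in count) printf "%s %d %s\n", name, count[name], last[name]
       }' | sort
}

# Usage (the time window is an assumption, adjust as needed):
#   podman logs --since 15m quay-app 2>&1 | summarize_components
```

If a component that normally logs on every request is missing from the summary, or its last timestamp is well behind the workers, that points at the web-serving side rather than the container as a whole.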
Nothing strange in the quay-app container details.
[root@bastion ~]# podman inspect quay-app
[
{
"Id": "41c21e84bb3e90a2ae46b480d9ca00e1a924a27e2c20157f09d21d29c9b4a389",
"Created": "2024-03-25T07:50:17.451450987-04:00",
"Path": "dumb-init",
"Args": [
"--",
"/quay-registry/quay-entrypoint.sh",
"registry"
],
"State": {
"OciVersion": "1.1.0-rc.3",
"Status": "running",
"Running": true,
"Paused": false,
"Restarting": false,
"OOMKilled": false,
"Dead": false,
"Pid": 7577,
"ConmonPid": 7575,
"ExitCode": 0,
"Error": "",
"StartedAt": "2024-03-25T07:50:17.61683645-04:00",
"FinishedAt": "0001-01-01T00:00:00Z",
"Health": {
"Status": "",
"FailingStreak": 0,
"Log": null
},
"CgroupPath": "/machine.slice/machine-libpod_pod_5e70ee01733b02f854d79d85dd78dc5c8ecdb2c50de7472a314441897f9296dc.slice/libpod-41c21e84bb3e90a2ae46b480d9ca00e1a924a27e2c20157f09d21d29c9b4a389.scope",
"CheckpointedAt": "0001-01-01T00:00:00Z",
"RestoredAt": "0001-01-01T00:00:00Z"
},
"Image": "93b30dda302e3554fcfea484da1fc7b981dc4ac173b195def4ab79b86dfaf616",
"ImageDigest": "sha256:19e0709632a860dc93e54e9d79b8da9b02334122775932eaefaccf4783524ef4",
"ImageName": "registry.redhat.io/quay/quay-rhel8:v3.8.14",
"Rootfs": "",
"Pod": "5e70ee01733b02f854d79d85dd78dc5c8ecdb2c50de7472a314441897f9296dc",
"ResolvConfPath": "/run/containers/storage/overlay-containers/db266da38b9c0ffd99a27f0873934a79cbf7776dd8996aa0e4b839f98f0b25ec/userdata/resolv.conf",
"HostnamePath": "/run/containers/storage/overlay-containers/41c21e84bb3e90a2ae46b480d9ca00e1a924a27e2c20157f09d21d29c9b4a389/userdata/hostname",
"HostsPath": "/run/containers/storage/overlay-containers/db266da38b9c0ffd99a27f0873934a79cbf7776dd8996aa0e4b839f98f0b25ec/userdata/hosts",
"StaticDir": "/var/lib/containers/storage/overlay-containers/41c21e84bb3e90a2ae46b480d9ca00e1a924a27e2c20157f09d21d29c9b4a389/userdata",
"OCIConfigPath": "/var/lib/containers/storage/overlay-containers/41c21e84bb3e90a2ae46b480d9ca00e1a924a27e2c20157f09d21d29c9b4a389/userdata/config.json",
"OCIRuntime": "crun",
"ConmonPidFile": "/run/quay-app.service-pid",
"PidFile": "/run/containers/storage/overlay-containers/41c21e84bb3e90a2ae46b480d9ca00e1a924a27e2c20157f09d21d29c9b4a389/userdata/pidfile",
"Name": "quay-app",
"RestartCount": 0,
"Driver": "overlay",
"MountLabel": "system_u:object_r:container_file_t:s0:c273,c984",
"ProcessLabel": "system_u:system_r:container_t:s0:c273,c984",
"AppArmorProfile": "",
"EffectiveCaps": null,
"BoundingCaps": [
"CAP_CHOWN",
"CAP_DAC_OVERRIDE",
"CAP_FOWNER",
"CAP_FSETID",
"CAP_KILL",
"CAP_NET_BIND_SERVICE",
"CAP_SETFCAP",
"CAP_SETGID",
"CAP_SETPCAP",
"CAP_SETUID",
"CAP_SYS_CHROOT"
],
"ExecIDs": [],
"GraphDriver": {
"Name": "overlay",
"Data": {
"LowerDir": "/var/lib/containers/storage/overlay/19dbf084110759a3d249cd4ec487e83f55eca64deafc5d51d04787a3716fadb8/diff",
"MergedDir": "/var/lib/containers/storage/overlay/fc1f2d2a88e454e8c41e3aa22e5d91e18001506f13821dd60eee47a918b1bc50/merged",
"UpperDir": "/var/lib/containers/storage/overlay/fc1f2d2a88e454e8c41e3aa22e5d91e18001506f13821dd60eee47a918b1bc50/diff",
"WorkDir": "/var/lib/containers/storage/overlay/fc1f2d2a88e454e8c41e3aa22e5d91e18001506f13821dd60eee47a918b1bc50/work"
}
},
"Mounts": [
{
"Type": "volume",
"Name": "f19507ef7f837c63cb92f116e042f12daa4c00a0c37c444cb1c7988687e66a0d",
"Source": "/var/lib/containers/storage/volumes/f19507ef7f837c63cb92f116e042f12daa4c00a0c37c444cb1c7988687e66a0d/_data",
"Destination": "/tmp",
"Driver": "local",
"Mode": "",
"Options": [
"nodev",
"exec",
"nosuid",
"rbind"
],
"RW": true,
"Propagation": "rprivate"
},
{
"Type": "volume",
"Name": "63e0413f366aa2f74f9370d04014e48038006bb4cf1b2ff5435fc9cb724de3ce",
"Source": "/var/lib/containers/storage/volumes/63e0413f366aa2f74f9370d04014e48038006bb4cf1b2ff5435fc9cb724de3ce/_data",
"Destination": "/var/log",
"Driver": "local",
"Mode": "",
"Options": [
"nodev",
"exec",
"nosuid",
"rbind"
],
"RW": true,
"Propagation": "rprivate"
},
{
"Type": "volume",
"Name": "097a7e8bf2e6d0a80a575d14bd6bdfa58d16919ff83a9b403d6dc06915ae20bc",
"Source": "/var/lib/containers/storage/volumes/097a7e8bf2e6d0a80a575d14bd6bdfa58d16919ff83a9b403d6dc06915ae20bc/_data",
"Destination": "/conf/stack",
"Driver": "local",
"Mode": "",
"Options": [
"nodev",
"exec",
"nosuid",
"rbind"
],
"RW": true,
"Propagation": "rprivate"
},
{
"Type": "bind",
"Source": "/opt/quay/config/quay-config",
"Destination": "/quay-registry/conf/stack",
"Driver": "",
"Mode": "",
"Options": [
"rbind"
],
"RW": true,
"Propagation": "rprivate"
},
{
"Type": "bind",
"Source": "/opt/quay/data",
"Destination": "/datastorage",
"Driver": "",
"Mode": "",
"Options": [
"rbind"
],
"RW": true,
"Propagation": "rprivate"
}
],
"Dependencies": [
"db266da38b9c0ffd99a27f0873934a79cbf7776dd8996aa0e4b839f98f0b25ec"
],
"NetworkSettings": {
"EndpointID": "",
"Gateway": "10.88.0.1",
"IPAddress": "10.88.0.2",
"IPPrefixLen": 16,
"IPv6Gateway": "",
"GlobalIPv6Address": "",
"GlobalIPv6PrefixLen": 0,
"MacAddress": "a6:9c:af:e1:1b:a7",
"Bridge": "",
"SandboxID": "",
"HairpinMode": false,
"LinkLocalIPv6Address": "",
"LinkLocalIPv6PrefixLen": 0,
"Ports": {
"8443/tcp": [
{
"HostIp": "",
"HostPort": "8443"
}
]
},
"SandboxKey": "/run/netns/netns-67bc251f-bac0-1817-c280-f49b54fda5bc",
"Networks": {
"podman": {
"EndpointID": "",
"Gateway": "10.88.0.1",
"IPAddress": "10.88.0.2",
"IPPrefixLen": 16,
"IPv6Gateway": "",
"GlobalIPv6Address": "",
"GlobalIPv6PrefixLen": 0,
"MacAddress": "a6:9c:af:e1:1b:a7",
"NetworkID": "podman",
"DriverOpts": null,
"IPAMConfig": null,
"Links": null,
"Aliases": [
"db266da38b9c",
"quay-pod"
]
}
}
},
"Namespace": "",
"IsInfra": false,
"IsService": false,
"KubeExitCodePropagation": "invalid",
"lockNumber": 37,
"Config": {
"Hostname": "quay-pod",
"Domainname": "",
"User": "1001",
"AttachStdin": false,
"AttachStdout": false,
"AttachStderr": false,
"Tty": false,
"OpenStdin": false,
"StdinOnce": false,
"Env": [
"LANG=C.UTF-8",
"QUAYDIR=/quay-registry",
"PYTHONUNBUFFERED=1",
"RED_HAT_QUAY=true",
"TERM=xterm",
"container=oci",
"PYTHONIOENCODING=UTF-8",
"LC_ALL=C.UTF-8",
"TZ=UTC",
"PYTHONUSERBASE=/app",
"QUAYPATH=/quay-registry",
"QUAYCONF=/quay-registry/conf",
"PATH=/app/bin/:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
"QUAYRUN=/quay-registry/conf",
"PYTHONPATH=/quay-registry",
"HOME=/quay-registry",
"HOSTNAME=quay-pod"
],
"Cmd": [
"registry"
],
"Image": "registry.redhat.io/quay/quay-rhel8:v3.8.14",
"Volumes": null,
"WorkingDir": "/quay-registry",
"Entrypoint": "dumb-init -- /quay-registry/quay-entrypoint.sh",
"OnBuild": null,
"Labels": null,
"Annotations": {
"io.container.manager": "libpod",
"io.kubernetes.cri-o.SandboxID": "db266da38b9c0ffd99a27f0873934a79cbf7776dd8996aa0e4b839f98f0b25ec",
"io.podman.annotations.cid-file": "/run/quay-app.service-cid",
"org.opencontainers.image.stopSignal": "15"
},
"StopSignal": 15,
"HealthcheckOnFailureAction": "none",
"CreateCommand": [
"/usr/bin/podman",
"run",
"--name",
"quay-app",
"-v",
"/opt/quay/config/quay-config:/quay-registry/conf/stack:Z",
"-v",
"/opt/quay/data:/datastorage:Z",
"--pod=quay-pod",
"--conmon-pidfile",
"/run/quay-app.service-pid",
"--cidfile",
"/run/quay-app.service-cid",
"--cgroups=no-conmon",
"--replace",
"registry.redhat.io/quay/quay-rhel8:v3.8.14"
],
"Umask": "0022",
"Timeout": 0,
"StopTimeout": 10,
"Passwd": true,
"sdNotifyMode": "container"
},
"HostConfig": {
"Binds": [
"f19507ef7f837c63cb92f116e042f12daa4c00a0c37c444cb1c7988687e66a0d:/tmp:rprivate,rw,nodev,exec,nosuid,rbind",
"63e0413f366aa2f74f9370d04014e48038006bb4cf1b2ff5435fc9cb724de3ce:/var/log:rprivate,rw,nodev,exec,nosuid,rbind",
"097a7e8bf2e6d0a80a575d14bd6bdfa58d16919ff83a9b403d6dc06915ae20bc:/conf/stack:rprivate,rw,nodev,exec,nosuid,rbind",
"/opt/quay/config/quay-config:/quay-registry/conf/stack:rw,rprivate,rbind",
"/opt/quay/data:/datastorage:rw,rprivate,rbind"
],
"CgroupManager": "systemd",
"CgroupMode": "private",
"ContainerIDFile": "/run/quay-app.service-cid",
"LogConfig": {
"Type": "journald",
"Config": null,
"Path": "",
"Tag": "",
"Size": "0B"
},
"NetworkMode": "container:db266da38b9c0ffd99a27f0873934a79cbf7776dd8996aa0e4b839f98f0b25ec",
"PortBindings": {},
"RestartPolicy": {
"Name": "",
"MaximumRetryCount": 0
},
"AutoRemove": false,
"VolumeDriver": "",
"VolumesFrom": null,
"CapAdd": [],
"CapDrop": [],
"Dns": [],
"DnsOptions": [],
"DnsSearch": [],
"ExtraHosts": [],
"GroupAdd": [],
"IpcMode": "container:db266da38b9c0ffd99a27f0873934a79cbf7776dd8996aa0e4b839f98f0b25ec",
"Cgroup": "",
"Cgroups": "default",
"Links": null,
"OomScoreAdj": 0,
"PidMode": "private",
"Privileged": false,
"PublishAllPorts": false,
"ReadonlyRootfs": false,
"SecurityOpt": [],
"Tmpfs": {},
"UTSMode": "container:db266da38b9c0ffd99a27f0873934a79cbf7776dd8996aa0e4b839f98f0b25ec",
"UsernsMode": "",
"ShmSize": 65536000,
"Runtime": "oci",
"ConsoleSize": [
0,
0
],
"Isolation": "",
"CpuShares": 0,
"Memory": 0,
"NanoCpus": 0,
"CgroupParent": "machine.slice/machine-libpod_pod_5e70ee01733b02f854d79d85dd78dc5c8ecdb2c50de7472a314441897f9296dc.slice",
"BlkioWeight": 0,
"BlkioWeightDevice": null,
"BlkioDeviceReadBps": null,
"BlkioDeviceWriteBps": null,
"BlkioDeviceReadIOps": null,
"BlkioDeviceWriteIOps": null,
"CpuPeriod": 0,
"CpuQuota": 0,
"CpuRealtimePeriod": 0,
"CpuRealtimeRuntime": 0,
"CpusetCpus": "",
"CpusetMems": "",
"Devices": [],
"DiskQuota": 0,
"KernelMemory": 0,
"MemoryReservation": 0,
"MemorySwap": 0,
"MemorySwappiness": 0,
"OomKillDisable": false,
"PidsLimit": 2048,
"Ulimits": [
{
"Name": "RLIMIT_NPROC",
"Soft": 4194304,
"Hard": 4194304
}
],
"CpuCount": 0,
"CpuPercent": 0,
"IOMaximumIOps": 0,
"IOMaximumBandwidth": 0,
"CgroupConf": null
}
}
]
Hey team, we just ran into this exact issue, with the same symptoms. I thought we had a one-off problem, but then I found this issue, so I'm adding a comment. I'll post some troubleshooting logs here. I can connect to port 8443 via netcat and have ruled out SELinux, fapolicyd, etc. as potential contributors.
It just... stops responding to HTTP traffic.
I should have captured the output but didn't. I did notice that a curl results in something similar to the following:
curl -vvv https://<quay-server>:8443 | head
* Rebuilt URL to: https://<quay-server>:8443/
* TCP_NODELAY set
* Connected to <quay-server> port 8443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
* CAfile: /etc/pki/tls/certs/ca-bundle.crt
CApath: none
} [5 bytes data]
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
} [512 bytes data]
< hangs right here where we should get a Server hello>
We never get the Server hello back, nor anything beyond that. As noted above, the port is open and responds via nc, and the logs keep rolling by in journalctl -fu quay-app.service or podman logs -f <pod_id>.
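For anyone else trying to confirm the same pattern (TCP connect succeeds, TLS handshake never completes), the two stages can be probed separately with a timeout on each. This is a rough sketch, not a definitive diagnostic: `probe_stages` is a hypothetical name, the 10-second default is arbitrary, and it assumes `bash`, `timeout`, and `openssl` are available on the host.

```shell
#!/usr/bin/env bash
# Report the first stage that fails: TCP connect, then TLS handshake.
# Usage: probe_stages <host> <port> [seconds-per-stage]
probe_stages() {
  local host="$1" port="$2" wait="${3:-10}"

  # Stage 1: plain TCP connect through bash's /dev/tcp (roughly what nc checks).
  if ! timeout "$wait" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null; then
    echo "tcp_failed"
    return 1
  fi

  # Stage 2: TLS handshake only. stdin is /dev/null so s_client exits once the
  # handshake is done; if the Server hello never arrives, timeout kills it.
  if ! timeout "$wait" openssl s_client -connect "$host:$port" </dev/null >/dev/null 2>&1; then
    echo "tls_failed_or_hung"
    return 2
  fi

  echo "tls_ok"
}
```

With the symptoms described above, `probe_stages <quay-server> 8443` would be expected to print `tls_failed_or_hung`, while a healthy registry should print `tls_ok`.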