Issue Report

Bug

CoreOS Version

NAME=CoreOS
ID=coreos
VERSION=1192.2.0
VERSION_ID=1192.2.0
BUILD_ID=2016-10-21-0026
PRETTY_NAME="CoreOS 1192.2.0 (MoreOS)"
ANSI_COLOR="1;32"
HOME_URL="https://coreos.com/"
BUG_REPORT_URL="https://github.com/coreos/bugs/issues"

Environment

AWS EC2, m4.large

Expected Behavior

Don't run systemd-timesync all the time ;)

Actual Behavior

systemd-timesyncs runs +- every 2 seconds, we get about 40k of the following per day on 1 of our etcd masters

December 13th 2016, 15:44:41.379	systemd-timesyncd	Synchronized to time server 52.17.30.119:123 (0.coreos.pool.ntp.org).	prd-1-master-10.1.21.150
December 13th 2016, 15:44:41.379	systemd-timesyncd	Synchronized to time server 52.17.30.119:123 (0.coreos.pool.ntp.org).	prd-1-master-10.1.21.150
December 13th 2016, 15:44:41.379	systemd-timesyncd	Network configuration changed, trying to establish connection.	prd-1-master-10.1.21.150
December 13th 2016, 15:44:41.378	systemd-timesyncd	Network configuration changed, trying to establish connection.	prd-1-master-10.1.21.150

Reproduction Steps

Run CoreOS 1192.2.0 configured for central services role and look at the logs. It actually only happens on 1 of our 3 central servers. The other have about 4k of the above lines per day. Still too much imho, but not such a big issues as the 40k lines on the other central server.

Dec 13 '16 15:12 simonvanderveldt

systemd-timesyncd runs this codepath every time the network links change (it does an inotify on /run/systemd/netif/links/). This seems to happen once a day on my machines. Your network interfaces probably shouldn't be changing that often. Can you provide more details about your environment? My first guess is that DHCP is renewing the lease often.

If your goal is just to reduce the logging, you can adjust the logging level for that service. Those messages are logged at info, so if you raise it to notice, they'll go away.

Dec 13 '16 23:12 crawford

@crawford thanks for the pointer (especially that systemd-timesyncd runs when the network links change), we'll invest time tomorrow to find the cause of this, though I expect it to be related to containers spawning and dying. Will report back once we know.

We're also considering just switching back to ntpd, would there be a reason not to do so?

Dec 14 '16 18:12 simonvanderveldt

I expect it to be related to containers spawning and dying

I just tested that theory and you are correct. I see three instances when I create an interactive container and two more when it's destroyed. Ouch.

I was able to suppress the logs by creating the following drop-in at /etc/systemd/system/systemd-timesyncd.service.d/logging.conf:

[Service]
Environment=SYSTEMD_LOG_LEVEL=notice

Dec 14 '16 22:12 crawford

@crawford we checked this as well and it's indeed tied to the lifecycle of containers and their veth* devices.

Below a docker run followed by a docker rm -f with systemd-timesyncd log lines in between: We get 3 timesyncd runs for both the create and destroy event.

Dec 16 10:52:09 core-01 dockerd[1254]: time="2016-12-16T10:52:09.941840764+01:00" level=info msg="POST /v1.21/containers/create"
Dec 16 10:52:09 core-01 dockerd[1254]: time="2016-12-16T10:52:09.972691946+01:00" level=info msg="POST /v1.21/containers/f65a3d083b99ec342c432448546d0524ff56d674937d4c3fd95754b024101a62/start"
Dec 16 10:52:09 core-01 systemd-timesyncd[585]: Network configuration changed, trying to establish connection.
Dec 16 10:52:09 core-01 systemd-timesyncd[585]: Synchronized to time server 109.236.83.221:123 (0.coreos.pool.ntp.org).
Dec 16 10:52:09 core-01 systemd-timesyncd[585]: Network configuration changed, trying to establish connection.
Dec 16 10:52:10 core-01 systemd-timesyncd[585]: Synchronized to time server 109.236.83.221:123 (0.coreos.pool.ntp.org).
Dec 16 10:52:10 core-01 systemd-timesyncd[585]: Network configuration changed, trying to establish connection.
Dec 16 10:52:10 core-01 systemd-timesyncd[585]: Synchronized to time server 109.236.83.221:123 (0.coreos.pool.ntp.org).
Dec 16 10:52:35 core-01 dockerd[1254]: time="2016-12-16T10:52:35.350604263+01:00" level=info msg="DELETE /v1.21/containers/f65a3d083b99?force=1"
Dec 16 10:52:35 core-01 systemd-timesyncd[585]: Network configuration changed, trying to establish connection.
Dec 16 10:52:35 core-01 systemd-timesyncd[585]: Synchronized to time server 109.236.83.221:123 (0.coreos.pool.ntp.org).
Dec 16 10:52:35 core-01 systemd-timesyncd[585]: Network configuration changed, trying to establish connection.
Dec 16 10:52:35 core-01 systemd-timesyncd[585]: Synchronized to time server 109.236.83.221:123 (0.coreos.pool.ntp.org).
Dec 16 10:52:36 core-01 systemd-timesyncd[585]: Network configuration changed, trying to establish connection.
Dec 16 10:52:36 core-01 systemd-timesyncd[585]: Synchronized to time server 109.236.83.221:123 (0.coreos.pool.ntp.org).

Also, while changing the log level gets rid of the log lines the actual NTP syncing still happens, which obviously shouldn't be the case at all, it's unnecessary I/O and you're effectively hammering the NTP servers. See https://bugs.freedesktop.org/show_bug.cgi?id=87505#c7 for a similar issue and a comment from @poettering about this.

I've been trying to find a way disable the timesyncd trigger when a network config changes, but no luck. It seems it's not possible. @crawford I think this needs a fix in CoreOS, but since it's a systemd issue/missing feature it might warrant an upstream systemd bug?

For now we'll switch back to ntpd, maybe CoreOS should do the same?

Dec 16 '16 10:12 simonvanderveldt

I'm experiencing this exact issue as well. I only have ~10 containers on my node, which all restarted this morning due to unrelated problem:

$ journalctl --since=2017-01-04 | grep timesync | wc -l
480

CoreOS beta (1235.2.0)

Jan 04 '17 09:01 joonas-fi

veth* is not managed by networkd, but networkd still creates a status file in /run/systemd/netif/links/.

Feb 06 '17 21:02 bgilbert

Facing the same issue here. Going to move back to NTPD as well.

Mar 19 '18 13:03 jbguerraz

hmm, if this is still relevant, please file a bug on systemd github, explaining the situation briefly. THis looks like something we should fix in timesyncd if this is still an issue. That said, https://bugs.freedesktop.org/show_bug.cgi?id=87505 has long been closed under the assumption this was fixed comprehensively.

Mar 19 '18 22:03 poettering

@poettering Yes this is still valid.

[root@abf bin]# journalctl --since=2018-01-04 | grep timesync | wc -l 9840

Mar 28 '18 14:03 tpgxyz

@tpgxyz Thanks for reporting upstream!

https://github.com/systemd/systemd/pull/8611 should fix the issue for systemd 239.

Apr 04 '18 22:04 bgilbert

Is it possible to backport patch to systemd 238 in stable?

Oct 08 '18 12:10 samm-git

bugs
bugs copied to clipboard

systemd-timesyncd executes timesyncs when creating and destroying a Docker container

Issue Report

Bug

CoreOS Version

Environment

Expected Behavior

Actual Behavior

Reproduction Steps

bugs bugs copied to clipboard

systemd-timesyncd executes timesyncs when creating and destroying a Docker container

Issue Report

Bug

CoreOS Version

Environment

Expected Behavior

Actual Behavior

Reproduction Steps

bugs
bugs copied to clipboard