open-vm-tools
open-vm-tools copied to clipboard
post-customize.service does not proper work due to time sync via ntp or chronyc after reboot
Describe the bug
I am not sure, but i think that sometimes i had bug with post-customize service due to time synchronization
My case: RHEL 7.9.
On first boot (without network) chronyc does not work.
From logs: /var/log/vmware-imc/toolsDeployPkg.log (My local time +3ITC, so, 14:10 is 17:10)
2023-03-16T14:10:52 DEBUG: Command: '/bin/touch /.guest-customization-post-reboot-pending'
2023-03-16T14:10:52 DEBUG: Exit Code: 0
After that server go to reboot:
[2023-03-16T14:10:52.393Z] [ debug] Command to exec : '/sbin/telinit'.
Rebooted, then server started and post-customize service and cronyc stared simultaneously.
Mar 16 17:11:04 vm-lt2-bd-kafka-1609 systemd[1]: Started Run post-customization script.
Mar 16 17:11:04 vm-lt2-bd-kafka-1609 chronyd[739]: chronyd version 3.4 starting (+CMDMON +NTP +REFCLOCK +RTC +PRIVDROP +SCFILTER +SIGND +ASYNCDNS +SECHASH +IPV6 +DEBUG)
But chronyc change my time:
Mar 16 17:11:08 vm-lt2-bd-kafka-1609 chronyd[739]: System clock wrong by -9.988691 seconds, adjustment started
Mar 16 17:10:58 vm-lt2-bd-kafka-1609 chronyd[739]: System clock was stepped by -9.988691 seconds
And /root/.customization/post-customize-guest.sh due to following line in script: if [ $now -lt $guardTime ]; then
Exit, because current time ($now) less or equal to creation time +10 sec of '/.guest-customization-post-reboot-pending' file
Logs:
Mar 16 17:10:59 customize-guest: Post-reboot agent started
Mar 16 17:10:59 customize-guest: The argument is ''
Mar 16 17:10:59 customize-guest: Running as non-service
Mar 16 17:10:59 customize-guest: Execution continues...
Mar 16 17:10:59 customize-guest: Waiting the GOS to reboot. Skipping
How can we avoid this? I think than post-customize-guest.sh must start before ntp\chrony?
Reproduction steps
Configure chronyc and make hw clock "real time + 1 min"
Expected behavior
Post-customize will work correctly
Additional context
FIX: maybe change post-customize service to start before ntp\chronyc\systemd-timesync?
No response
Hi 4815162342lost,
Yes, post-customize service and cronyc stared simultaneously lead to this issue. So far, it seems that change post-customize service to start before ntp\chronyc\systemd-timesync is a good idea. I am investigating to add Before=chronyd.service in post-customize.service, I will reply you later when having some results.
Best Regards, Cheng Cheng
Hi 4815162342lost,
Here is a workaround to run the post-customize in the current problematic scenario. Could you try to reboot the vm again when the issue happened and check whether the post-customize is running after this time reboot? As when the issue happened, the post customize script was skipped as the log "Waiting the GOS to reboot. Skipping" mentioned, but the guest-customization-post-reboot-pending is still exist, if you reboot the vm again, the post-customize script should run again.
Best Regards, Cheng Cheng
Hello, thanks for your reply.
Yes, your workaround works, i checked earlier.
But we have several automation scripts and etc. Workaround unwanted for our infrastructure.
But if there no any proper way to fix issue, we can add "Second reboot" to our infrastructure.
Or could you add "Second reboot" if needed to /root/.customization/post-customize-guest.sh script? /root/.customization/post-customize-guest.sh should check, if issue with time sync occurred, and make second reboot or restart post-customize service after several time if needed. Or something else.
But main problem, workaround will not work, if chronyc will change time to significant amount (for example, on 5-10 min.) I.e. need "proper fix".
Hi 4815162342lost, Yes, the 2nd reset workaround is only a temporary way, we will further investigate how to fix the problem of this scenario from the code. Thanks for the details.