
Add draft for VM time synchronisation decisions

Open kgube opened this issue 1 year ago • 3 comments

https://github.com/SovereignCloudStack/issues/issues/231

kgube avatar Apr 24 '24 11:04 kgube

Had a discussion with @kgube about the motivation, goals, and contents of this DR.

Input/framework conditions that may be useful: (needs to be evaluated)

  • It would be good to reference software systems that customers might run and that rely on quorum algorithms, and to point out why accurate system time matters to them (ZooKeeper, RabbitMQ, etcd, Consul, Hazelcast, Ceph)
  • It would also be good to note that using public (internet) NTP servers from behind the same S-NAT IP can lead to rate limiting when dozens of systems in a project use the same NTP servers, because those servers see dozens of NTP sessions coming from a single IP
  • SCS environments themselves should be operated with at least 3 central, CSP-local NTP sources (for Ceph, RabbitMQ, ...)
  • Whether overcommit, or a VM not being “scheduled”, affects the quality of time synchronization under the virtualization in use must not matter to the user
  • The CSP offers at least three local, non-rate-limited NTP servers that have at least 5 statically defined upstream stratum servers or high-quality local time sources
  • We can define a minimum quality that is based on the requirements of common systems and provides some reserve to keep popular systems running without problems (offset, jitter, frequency drift, ...)
  • The CSP ensures that a time with a minimum quality can be maintained in VMs with a reference setup
    • defined chrony setup/configuration that uses the min. 3 CSP NTP servers
    • this should be possible with all flavors (in some virtualization technologies the size of the virtual machine affects how it is scheduled and, as a result, its time synchronization)
    • the health check service starts several VMs (e.g. 3) of a single defined flavor, distributed across the CSP landscape, that run permanently, and checks their time quality to evaluate compliance
  • Lower priority, but exciting: an idea for how to provide the flavor images with a standardized setup by default that can be used independently of the CSP (e.g. by using a standardized setup mechanism, or standardized references to the servers)
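
To make the "minimum quality" and health-check points above more concrete, here is a rough Python sketch of how such a compliance check could look. Everything in it is an assumption for illustration only: the names `QualityLimits`, `vm_is_compliant` and `landscape_is_compliant` are hypothetical, jitter is approximated as the standard deviation of the sampled offsets, and the actual limits would still have to be derived from the requirements of the systems listed above.

```python
from dataclasses import dataclass
from statistics import pstdev


@dataclass(frozen=True)
class QualityLimits:
    """Hypothetical minimum-quality limits; real values still need to be defined."""
    max_abs_offset_s: float  # largest tolerated clock offset, in seconds
    max_jitter_s: float      # largest tolerated offset spread, in seconds


def vm_is_compliant(offsets_s, limits):
    """Check the offset samples (in seconds) of one health-check VM."""
    if not offsets_s:
        # No measurements means compliance cannot be shown.
        return False
    worst_offset = max(abs(o) for o in offsets_s)
    # Assumption: jitter is approximated by the population standard deviation.
    jitter = pstdev(offsets_s)
    return worst_offset <= limits.max_abs_offset_s and jitter <= limits.max_jitter_s


def landscape_is_compliant(samples_by_vm, limits):
    """All health-check VMs (e.g. 3 across the CSP landscape) must be compliant."""
    return bool(samples_by_vm) and all(
        vm_is_compliant(samples, limits) for samples in samples_by_vm.values()
    )
```

The offset samples could come from whatever the reference chrony setup reports on each VM; how they are collected is deliberately left out here.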

scoopex avatar Aug 21 '24 10:08 scoopex

I discussed the potential upstream topic with the Neutron team and created an RFE issue for it.

The topic will also be discussed during the PTG; it is currently scheduled for the 2024-10-24 15:00 - 16:00 UTC timeslot.

kgube avatar Oct 23 '24 07:10 kgube

Unfortunately I could not attend the PTG, but the topic was discussed, and some questions on the scope of the feature were forwarded to the RFE ticket, which I answered. In particular, both OVN and dnsmasq allow global DHCP options, so provided that the link-local NTP server address is the same in all subnets (which would be a design goal), we can configure it as a global option and no dynamic port-specific DHCP config is necessary.
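
As a rough illustration of the dnsmasq side, the global option could be rendered like the sketch below. This is not what Neutron does today: the helper `dnsmasq_ntp_option` is hypothetical, and the address in the usage line is only a placeholder, since the actual link-local address has not been decided. The sketch relies on dnsmasq accepting `dhcp-option=option:ntp-server,<address>` (DHCP option 42).

```python
import ipaddress


def dnsmasq_ntp_option(ntp_address: str) -> str:
    """Render a global dnsmasq DHCP option line for one NTP server.

    Hypothetical helper: it only accepts an IPv4 link-local address,
    matching the design goal of one address that is valid in all subnets.
    """
    addr = ipaddress.ip_address(ntp_address)
    if addr.version != 4 or not addr.is_link_local:
        raise ValueError(f"{ntp_address} is not an IPv4 link-local address")
    return f"dhcp-option=option:ntp-server,{addr}"


# Placeholder address, for illustration only.
print(dnsmasq_ntp_option("169.254.169.123"))
```

Because the line does not depend on the subnet or port, it can be written once into the dnsmasq configuration instead of being generated per port.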

If we want to pursue this feature, it would probably be best to track it in a separate issue. The next step would be to take the RFE ticket to a Neutron Drivers meeting, get the team to confirm the scope of the feature, and ask for guidance on how to proceed with the implementation.

kgube avatar Feb 19 '25 12:02 kgube