[wip] add k8s integration design doc
- support switching between standalone mode and cluster mode at runtime
You mentioned there are 2 modes. Are they mutually exclusive? Can we have them at the same time?
- support cluster joining mechanism for a newly added switch
Either a switch could identify its cluster master, or a cluster master could discover or identify all the switches. Do you have a design?
- How much disk space/CPU/RAM does a switch need?
- Can we have a terminology explanation for the SONiC use case, such as pod, cluster, service, deployment, etc.?
- What is the scalability of a single-master cluster? How do we manage them if we must use multiple masters?
- Do you assume all the switches and the master in a cluster are in one layer 2 network?
- What is the process for a switch to upgrade its whole image?
- What is the process for the master to upgrade its k3s package?
- What is the process for a switch to upgrade its k3s package/Docker?
- before joining the cluster, we need to stop the containers which are currently running on the node since the master will start deploying the same containers on this node
This is very bad for the swss, syncd, teamd, and bgp dockers. Can we relax it?
- In the longer term, we may add new Docker containers to the switch image. Could the master manage totally different switch images in a cluster?
@qiluo-msft Thanks for the comments. I added a glossary to the doc.
- You mentioned there are 2 modes. Are they mutually exclusive? Can we have them at the same time?
They should be mutually exclusive. In standalone mode, the switch itself controls the containers that run on it; in cluster mode, the k8s controller has that control.
- Either a switch could identify its cluster master, or a cluster master could discover or identify all the switches. Do you have a design?
The joining procedure needs to be invoked by the switch, so the switch should identify its cluster master, obtain the token, and ask the master to join the cluster. This could be done in a ZTP (or Ansible?) procedure; see the sketch below.
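As a minimal sketch only (assuming a recent k3s release, which can read agent options from a config file instead of command-line flags), the ZTP/Ansible step could drop something like this on the switch before starting the k3s agent. The server address, token, and label are placeholder values:

```yaml
# /etc/rancher/k3s/config.yaml on the switch (hypothetical values)
# The k3s agent reads this file, contacts the master named in "server",
# and joins the cluster using the shared join "token".
server: https://k8s-master.example.com:6443   # cluster master, reachable over IP
token: <join-token-distributed-by-ztp>        # placeholder; provisioned by ZTP/Ansible
node-label:
  - "sonic-hwsku=Force10-S6000"               # example label, usable later for scheduling
```

The label shown here is only an illustration of how a switch could advertise its platform to the master at join time.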
- How much disk space/CPU/RAM does a switch need?
Added in the document. I'm not sure about CPU usage; in my experience, CPU usage has never been a problem.
- Can we have a terminology explanation for the SONiC use case, such as pod, cluster, service, deployment, etc.?
Added in the document.
- What is the scalability of a single-master cluster? How do we manage them if we must use multiple masters?
The official documentation says it can scale up to 5000 nodes. However, I think this number really depends on the environment. Also, k3s uses SQLite as the default internal DB instead of etcd, which should also affect performance.
- Do you assume all the switches and the master in a cluster are in one layer 2 network?
No. The k8s master only needs IP reachability to the nodes it controls.
- What is the process for a switch to upgrade its whole image?
The easiest way would be to remove the node from the cluster and re-join it after the upgrade.
- What is the process for the master to upgrade its k3s package?
T.B.D. I'll investigate what k3s offers. Does SONiC have a mechanism to upgrade Docker?
- What is the process for a switch to upgrade its k3s package/Docker?
T.B.D. I'll investigate what k3s offers.
- This is very bad for the swss, syncd, teamd, and bgp dockers. Can we relax it?
Can't we use warm reboot for the transition as we did at the hackathon?
- In the longer term, we may add new Docker containers to the switch image. Could the master manage totally different switch images in a cluster?
Yes, as I described, this can be supported by using selectors and labels; see the sketch below.
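For illustration only (the workload name, label key, and image tags are hypothetical, and the actual design may choose a different workload type), a DaemonSet restricted by a nodeSelector shows the idea: each switch type carries its own label, and the master schedules the matching image onto just those nodes.

```yaml
# Hypothetical DaemonSet: deploys the snmp container only to switches
# labeled with a specific HWSKU. A second DaemonSet with a different
# nodeSelector and image would serve switches running a different image.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: snmp-s6000                       # hypothetical name
spec:
  selector:
    matchLabels:
      app: snmp-s6000
  template:
    metadata:
      labels:
        app: snmp-s6000
    spec:
      nodeSelector:
        sonic-hwsku: Force10-S6000       # only nodes carrying this label
      containers:
      - name: snmp
        image: docker-sonic-snmp:s6000   # hypothetical per-platform image tag
```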
Before upgrading a container, the controller may need to take some actions, such as taking a BGP snapshot or draining traffic from the switch. After upgrading the container, the controller may need to take some post-upgrade actions, such as comparing against the snapshot or restoring traffic.
Is there any consideration for supporting such actions in k8s?
@lguohan We can use the postStart and preStop hooks to invoke such actions:
- https://kubernetes.io/docs/concepts/workloads/pods/pod/#termination-of-pods
- https://kubernetes.io/docs/concepts/containers/container-lifecycle-hooks/#container-hooks
Also, we can use init containers to run some tasks before the kubelet starts the main containers (both options are illustrated in the sketch below):
- https://kubernetes.io/docs/concepts/workloads/pods/init-containers/
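As a rough sketch, and not a proposal for the actual manifests (the image names and script paths are placeholders), a pod spec combining both mechanisms could look like this:

```yaml
# Hypothetical bgp pod: an init container runs a pre-upgrade step before
# the main container starts, and lifecycle hooks run post-start/pre-stop
# actions such as restoring or draining traffic.
apiVersion: v1
kind: Pod
metadata:
  name: bgp
spec:
  initContainers:
  - name: pre-upgrade-snapshot
    image: sonic-upgrade-tools           # placeholder helper image
    command: ["/scripts/take-bgp-snapshot.sh"]
  containers:
  - name: bgp
    image: docker-fpm-frr                # placeholder image name
    lifecycle:
      postStart:
        exec:
          command: ["/scripts/restore-traffic.sh"]   # post-upgrade action
      preStop:
        exec:
          command: ["/scripts/drain-traffic.sh"]     # graceful drain before stop
```

The same actions could also be driven by an operator, as noted next.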
In k8s, this kind of application-specific operation can also be implemented as an operator:
- https://coreos.com/operators/
- https://github.com/operator-framework
Is this dead?