Use Antrea Agent to manage External Node
Describe what you are trying to solve
An External Node is a machine that is not a Node in the K8s cluster; it can be a virtual machine (VM) or a bare-metal server (BM). Using Antrea Agent to manage an External Node means that Antrea Agent runs on the External Node host, realizes the Antrea NetworkPolicy (ANP) rules applied to it, and can provide a support bundle to help with troubleshooting.
Describe the solution you have in mind
A new "role" property is introduced to tell Antrea Agent what kind of node it is running on. For now, the role values are Node (the default, meaning the Agent runs on a K8s worker Node), VM, and BM. A K8s cluster and Antrea Controller are still required, and an Antrea Agent running on an External Node also connects to the Kube APIServer and Antrea Controller. Note, however, that the External Node does not join the cluster.
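As an illustration of the "role" property, here is a minimal Go sketch. The type name `AgentRole`, the constants, and the helper functions are hypothetical and only mirror the three values listed above; they are not taken from the Antrea code base.

```go
package main

import (
	"fmt"
	"strings"
)

// AgentRole is a hypothetical type for the proposed "role" property.
type AgentRole string

const (
	RoleNode AgentRole = "Node" // default: Agent runs on a K8s worker Node
	RoleVM   AgentRole = "VM"   // Agent runs on a virtual machine outside the cluster
	RoleBM   AgentRole = "BM"   // Agent runs on a bare-metal host outside the cluster
)

// ParseAgentRole maps a configured string to an AgentRole.
// An empty value falls back to the default role, Node.
func ParseAgentRole(s string) (AgentRole, error) {
	switch strings.TrimSpace(s) {
	case "", string(RoleNode):
		return RoleNode, nil
	case string(RoleVM):
		return RoleVM, nil
	case string(RoleBM):
		return RoleBM, nil
	default:
		return "", fmt.Errorf("unknown agent role %q", s)
	}
}

// IsExternalNode reports whether the role denotes a host outside the cluster.
func (r AgentRole) IsExternalNode() bool {
	return r == RoleVM || r == RoleBM
}

func main() {
	for _, s := range []string{"", "VM", "router"} {
		role, err := ParseAgentRole(s)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("role=%s external=%t\n", role, role.IsExternalNode())
	}
}
```

A helper like `IsExternalNode` would let the Agent skip cluster-only features without checking each role value at every call site.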
An ExternalEntity describes a network interface and its IP resources on the External Node. One External Node can have multiple ExternalEntity objects; the ExternalNode property of these objects is the same and identifies the External Node object. Antrea Agent also uses the ExternalNode value as its own name. An ExternalEntity can be used in the AppliedTo field of an ANP rule, and the existing ANP API needs to be updated to support configuring this item. The administrator creates the ANP and ExternalEntity objects in the K8s cluster.
Antrea Agent gets/watches/lists ExternalEntity objects from Antrea Controller via a new internal API, and Antrea Controller computes the span so that it only gives the Agent the ExternalEntity objects relevant to its ExternalNode. This means the service account that runs Antrea Agent on the External Node does not need get/watch/list permissions for ExternalEntity on the K8s APIServer. As a result, someone who obtains the service account token cannot learn about ExternalEntity objects that belong to other hosts.
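The effect of the span computation can be sketched as a simple filter. This is a deliberately simplified illustration: the `ExternalEntity` struct below is a hypothetical stand-in with only the fields needed here, and the real span computation in Antrea Controller is likely more involved.

```go
package main

import "fmt"

// ExternalEntity is a simplified, hypothetical stand-in for the real object.
type ExternalEntity struct {
	Name         string
	ExternalNode string // identifies the External Node the entity belongs to
}

// FilterBySpan returns only the entities whose ExternalNode matches nodeName,
// which is what an Agent on that External Node would be allowed to receive.
func FilterBySpan(all []ExternalEntity, nodeName string) []ExternalEntity {
	var out []ExternalEntity
	for _, e := range all {
		if e.ExternalNode == nodeName {
			out = append(out, e)
		}
	}
	return out
}

func main() {
	all := []ExternalEntity{
		{Name: "vm-1-eth0", ExternalNode: "vm-1"},
		{Name: "vm-1-eth1", ExternalNode: "vm-1"},
		{Name: "vm-2-eth0", ExternalNode: "vm-2"},
	}
	for _, e := range FilterBySpan(all, "vm-1") {
		fmt.Println(e.Name)
	}
}
```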
Antrea Agent uses OVS to realize an ExternalEntity by attaching its network interface to the OVS bridge as an uplink. At the same time, Antrea Agent creates a new virtual interface on the host, which takes over the name, IP, MAC, and route configuration of the uplink. So that the virtual interface can be configured with the uplink's original name, Antrea Agent renames the uplink before attaching it to OVS. Antrea uses the ExternalEntity.Endpoint IP and/or name to find the uplink on the host. One InterfaceConfig object is created for each pair of host interface and uplink and kept in memory; it is used when realizing NetworkPolicy.
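The uplink lookup by IP and/or name can be sketched as follows. The `HostInterface` type and `FindUplink` function are hypothetical, and the matching rule (every criterion that is provided must match) is an assumption, since the proposal does not spell out the exact semantics of "IP and/or name".

```go
package main

import (
	"fmt"
	"net"
)

// HostInterface is a hypothetical, simplified view of a host network interface.
type HostInterface struct {
	Name string
	IPs  []string
}

// FindUplink returns the first interface that matches every criterion given.
// An empty name or ip means "do not match on this field"; at least one of
// them must be set. An ip that cannot be parsed never matches.
func FindUplink(ifaces []HostInterface, name, ip string) (HostInterface, bool) {
	if name == "" && ip == "" {
		return HostInterface{}, false
	}
	var want net.IP
	if ip != "" {
		if want = net.ParseIP(ip); want == nil {
			return HostInterface{}, false
		}
	}
	for _, iface := range ifaces {
		if name != "" && iface.Name != name {
			continue
		}
		if want != nil && !hasIP(iface, want) {
			continue
		}
		return iface, true
	}
	return HostInterface{}, false
}

// hasIP reports whether the interface carries the given address.
func hasIP(iface HostInterface, want net.IP) bool {
	for _, s := range iface.IPs {
		if got := net.ParseIP(s); got != nil && got.Equal(want) {
			return true
		}
	}
	return false
}

func main() {
	ifaces := []HostInterface{
		{Name: "eth0", IPs: []string{"192.0.2.10"}},
		{Name: "eth1", IPs: []string{"192.0.2.20"}},
	}
	if uplink, ok := FindUplink(ifaces, "", "192.0.2.20"); ok {
		fmt.Println("uplink:", uplink.Name)
	}
}
```

On a real host, the interface list could be built from `net.Interfaces()` and each interface's `Addrs()`; that step is omitted here to keep the sketch independent of the machine it runs on.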
A new OVS pipeline is introduced for the External Node scenario. AntreaProxy, Egress, AntreaIPAM, and Multicast are not required on an External Node, so their flow tables should not be realized in OVS. The flexible pipeline is used as the framework to set up the OVS pipeline. Packets between the uplink and the host interface are forwarded directly if no ANP rule is hit.
AntreaAgentInfo is still used by Antrea Agent to report agent status. The service account that runs Antrea Agent on the External Node is given the privilege to create/update AntreaAgentInfo. For security, admission webhook validation is introduced on Antrea Controller and runs before an AntreaAgentInfo create/update request takes effect. The administrator should create a new CR to describe the VMs (ExternalNode values) allowed for a service account. During validation, Antrea Controller compares the user of the AntreaAgentInfo create/update request with the name of the AntreaAgentInfo.
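The validation check can be sketched as a lookup in an allow-list built from the administrator's CR. The `AllowList` type and its `Validate` method are hypothetical, and the sketch assumes that the AntreaAgentInfo name equals the ExternalNode name. The user string follows the standard K8s service account user name format, `system:serviceaccount:<namespace>:<name>`; the namespace and account names are made up.

```go
package main

import "fmt"

// AllowList maps a request user name to the ExternalNode names that user
// may create/update AntreaAgentInfo for. It is a hypothetical in-memory
// view of the administrator-provided CR.
type AllowList map[string][]string

// Validate returns nil if the user may create/update the AntreaAgentInfo
// with the given name, and an error otherwise.
func (a AllowList) Validate(user, agentInfoName string) error {
	for _, allowed := range a[user] {
		if allowed == agentInfoName {
			return nil
		}
	}
	return fmt.Errorf("user %q is not allowed to create/update AntreaAgentInfo %q", user, agentInfoName)
}

func main() {
	const user = "system:serviceaccount:vm-ns:vm-agent" // made-up account
	allow := AllowList{user: {"vm-1", "vm-2"}}

	for _, name := range []string{"vm-1", "vm-9"} {
		if err := allow.Validate(user, name); err != nil {
			fmt.Println("denied:", err)
			continue
		}
		fmt.Println("allowed:", name)
	}
}
```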
antctl is also supported on the External Node. For security, the APIServer run by Antrea Agent should only serve antctl requests from localhost. Two options are being considered for this: 1) use the K8s APIServer library to listen on localhost, or 2) use a Unix domain socket (UDS) to implement a new APIServer.
Because the APIServer on the External Node is not allowed to serve remote requests, a new CRD is proposed to tell the Agent to collect support bundle files.
Both Linux and Windows are supported as External Node hosts.
Describe how your solution impacts user flows
The administrator should create a CR object that maps each allowed service account to the ExternalNode names it is allowed to use. Antrea Agent reads the External Node value from the environment, so the user should pre-configure the environment variable on the host where Antrea Agent runs. OVS must be installed on the host before running Antrea Agent. Antrea Agent runs as a process on the host; the user can create a system service for it or run the process directly, as required.
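Reading the External Node name from the environment could look like the sketch below. The variable name `EXTERNAL_NODE_NAME` is an assumption made for illustration; the proposal does not name the variable. The lookup function is passed in so that the logic can be exercised without touching the real process environment.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// EnvExternalNodeName is a hypothetical environment variable name.
const EnvExternalNodeName = "EXTERNAL_NODE_NAME"

// ExternalNodeNameFromEnv returns the External Node name found through the
// given lookup function (normally os.LookupEnv). A missing or blank value
// is an error, because the Agent cannot identify itself without it.
func ExternalNodeNameFromEnv(lookup func(string) (string, bool)) (string, error) {
	name, ok := lookup(EnvExternalNodeName)
	name = strings.TrimSpace(name)
	if !ok || name == "" {
		return "", fmt.Errorf("environment variable %s is not set", EnvExternalNodeName)
	}
	return name, nil
}

func main() {
	name, err := ExternalNodeNameFromEnv(os.LookupEnv)
	if err != nil {
		fmt.Fprintln(os.Stderr, "error:", err)
		return
	}
	fmt.Println("running as External Node:", name)
}
```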
Describe the main design/architecture of your solution
For more details about the design, please refer to https://docs.google.com/document/d/1hEJ63k0tcrLdyU76EmPFvjwcIrlDLEu9LIuZSqE58c4/edit?usp=sharing
Test plan
New e2e tests should be introduced to cover the External Node scenario. A K8s cluster is required, along with two VMs that are not worker Nodes. Antrea Agent is configured with the VM role and run on both VMs. The tests verify that ANP rules are applied correctly on the VMs; the rules should cover both egress and ingress, and the remote addresses should include both ExternalEntity and IPBlock.
Work break down
- [x] Support configuration for Antrea Agent to run on an External Node
- [x] New CRD for External Node definition
- [x] Realize External Node by Antrea Agent
- [x] Support applying ANP to an External Node
- [x] CI support for External Node case
- [ ] Collect support bundle files using CRD configuration
- [ ] antctl for External Node case
This issue is stale because it has been open 90 days with no activity. Remove stale label or comment, or this will be closed in 90 days