azure-container-networking
feat: refactor cni telemetry
Reason for Change:
The telemetry that CNI currently sends is insufficient to debug CNI issues. This PR refactors the CNI telemetry to send more and better-quality logs.
- Moves telemetry into a package-level variable so it is accessible everywhere
- Removes sending certain metrics as they are not used
- Sets the subcontext to the container id. The container id is kept consistent throughout CNI calls for the same pod, meaning an ADD and DEL call (and all related logs) for the same pod will have the same subcontext/container id. The container id is also what is stored in stateless mode as one of the keys.
- Sets the operation id before any telemetry events are sent. The operation id is used for sampling should we end up enabling it.
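The context-setting behavior described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the `Telemetry` type, the `tel` variable, and the `SetContext`, `SendEvent`, and `Events` methods are hypothetical names chosen for the example. It only shows the idea of a package-level handle whose operation id and subcontext (the container id) are set before the first event is sent, so every later event carries both.

```go
package main

import (
	"fmt"
	"sync"
)

// Event is a hypothetical telemetry record; field names are illustrative only.
type Event struct {
	OperationID string // set once per CNI invocation; could drive sampling later
	SubContext  string // container id, shared by ADD and DEL for the same pod
	Message     string
}

// Telemetry is a hypothetical handle that stamps every event with the current context.
type Telemetry struct {
	mu          sync.Mutex
	operationID string
	subContext  string
	sent        []Event
}

// tel is a package-level variable, so any function in the package can log through it.
var tel = &Telemetry{}

// SetContext records the operation id and container id before any event is sent.
func (t *Telemetry) SetContext(operationID, containerID string) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.operationID = operationID
	t.subContext = containerID
}

// SendEvent attaches the current context to the message and stores the event.
// A real implementation would forward it to a telemetry backend instead.
func (t *Telemetry) SendEvent(msg string) Event {
	t.mu.Lock()
	defer t.mu.Unlock()
	e := Event{OperationID: t.operationID, SubContext: t.subContext, Message: msg}
	t.sent = append(t.sent, e)
	return e
}

// Events returns a copy of everything sent so far.
func (t *Telemetry) Events() []Event {
	t.mu.Lock()
	defer t.mu.Unlock()
	return append([]Event(nil), t.sent...)
}

func main() {
	// Assumed ids for illustration; real values come from the CNI invocation.
	tel.SetContext("op-1", "container-abc")
	tel.SendEvent("ADD started")
	tel.SendEvent("ADD completed")
	for _, e := range tel.Events() {
		fmt.Printf("%s %s %s\n", e.OperationID, e.SubContext, e.Message)
	}
}
```

Because the container id stays the same across CNI calls for a pod, filtering on the subcontext in a sketch like this would group the ADD and DEL events for that pod together.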
Examples of logged information (to be added in a separate PR; this PR focuses on refactoring):
- CNI add network configuration, arguments
- CNI add completion with endpoint info struct information (contains hns endpoint id and hns network id), interface results from the ipam invoker, and any error that occurred
- CNI del network configuration, arguments
- CNI del completion with error that occurred
- HNS Endpoint struct before creation / HNS Endpoint Id during deletion
- HNS Network struct before creation / HNS Network Id during deletion
- Deletion/release of each IP (even if it does not exist)
- Mapping sent to CNS during stateless CNI mode during Update Endpoint State
- Exact CNS response from CNS ipam invoker
- Exact CNS response from multitenancy ipam invoker
- Transparent VLAN mode: creation/deletion of the VLAN veth interface
Potential additions:
- endpoint and network structs saved to azure-vnet.json statefile
Issue Fixed:
Requirements:
- [X] uses conventional commit messages
- [ ] includes documentation
- [x] adds unit tests
- [X] relevant PR labels added
Notes:
- Pipeline run to prove logs are sent to Kusto: https://msazure.visualstudio.com/One/_build/results?buildId=108208651&view=results
- Passing run: https://msazure.visualstudio.com/One/_build/results?buildId=108563465&view=results