feat(advanced-metrics): Add Support for Topology (Availability) Zone Labels
Description
This PR introduces availability zone labels (topology.kubernetes.io/zone) for remote context advanced metrics.
With this PR, it is possible to add the zone to metrics as sourceLabel or destinationLabel via the MetricsConfiguration CRD.
The data is added via the enricher component to the RetinaMetadata in the Flow struct's Extensions attribute, because the node controller already has access to the labels of the Nodes.
Related Issue
https://github.com/microsoft/retina/issues/1654
Checklist
- [x] I have read the contributing documentation.
- [x] I signed and signed-off the commits (
git commit -S -s ...). See this documentation on signing commits. - [x] I have correctly attributed the author(s) of the code.
- [x] I have tested the changes locally.
- [x] I have followed the project's style guidelines.
- [x] I have updated the documentation, if necessary.
- [x] I have added tests, if applicable.
Screenshots (if applicable) or Testing Completed
Additional Notes
We have not added it to the local context because we don't see the value in adding just one zone to a pod. We have not added it to the hubble scenario because hubble doesn't support availability zones for the destination yet. Tagging the co-authors: @ScheererJ @rrhubenov
Please refer to the CONTRIBUTING.md file for more information on how to contribute to this project.
@microsoft-github-policy-service agree company="SAP SE"
Done! Thanks for re-running the CI for all the force pushes, we had issues running some of the makefile targets locally. I also can't figure out why the Windows image builds fail - any pointers? I can't really make sense of the 503 and EOFs.
@rickardsjp MCR is acting up for windows 2019 images, but it should be transient
This PR will be closed in 7 days due to inactivity.
Pull request closed due to inactivity.
Any chance to recover this PR?
Hi @rickardsjp, would you like to continue working on this PR?
Hi @SRodi, yep! Didn't mean to abandon it, just sort of ran out of capacity. Will try to pick it up again in the coming weeks.
There were some changes to the protobuf definitions, so I rebased the PR for starters. I'll try and take another swing at the PR feedback in the coming weeks.
Hi @SRodi,
can we have test coverage for negative/edge cases, like missing or malformed zones?
Malformed zones are sort of unlikely, the zone label on k8s nodes is a string and at most would restrict the length. Could you specify where you would like to see additional tests? I.e. in the controller test, the metrics test, or the types tests?
is there any backward compatibility consideration and potential impact on downstream consumers for RetinaEndpoint? (I do not think so, but have you considered/tested that?)
Does this mean the custom resource RetinaEndpoint or the internal RetinaEndpoint struct? The CR is unchanged and we do not foresee issues with the internal type.
shall we make the default zone "unknown" a configurable value? and maybe add logging for missing zones?
Configurable value: "unknown" is not a configurable value for other fields, see for example here. Logging: Added some debug level logs for empty zone fields.