Implement lookup controller
Fixes #139.
Description
I'd like to be able to provision and manage my druid clusters with GitOps techniques. This druid operator allows me to declaratively configure the nodes of a druid cluster. Another part is being able do declaratively specify ingestions, which others are working on. This PR brings another piece of the puzzle by allowing users to declare druid lookups as K8s resources.
This PR implements a reconciliation strategy of getting a comprehensive understanding of both current and desired lookups, calculating a diff and based on that diff taking corrective actions. This provides a simple and resilient reconciliation loop. Another strategy that were explored during the development of this PR were to only reconcile one druid lookup K8s resource at time as events are received by the controller. While this approach could prove more efficient due to not requesting the complete state in every iteration, it proved to require fragile bookkeeping to handle certain cases, e.g. editing a lookup K8s resource to rename a lookup.
This PR has:
- [ ] been tested on a real K8S cluster to ensure creation of a brand new Druid cluster works.
- [ ] been tested for backward compatibility on a real K8S cluster by applying the changes introduced here on an existing Druid cluster. If there are any backward incompatible changes then they have been noted in the PR description.
- [x] added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
- [ ] added documentation for new or modified features or behaviors.
Key changed/added files in this PR
-
controllers/lookup/*
@MrLarssonJr Thanks a lot for working on this PR. I will review it over weekend. BTW on the first glance, is lookup updates working ? if i update the lookup cr will it update the lookup ? Also can we construct status.
Happy to help @AdheipSingh!
I must admit I've only tested in a kind cluster, but in there updates work!
Sure, I'll take a look at integrating status.
I also saw the notification that some test failed, I'll look into that as well.
@AdheipSingh, I just wanted to check if you've had some time to look over my new commits?
@itamar-marom @cyril-corbon if you can help with the reviews.
Hi @itamar-marom, thanks for your feedback! I'll incorporate it and come back with some new commits.
I've marked conversations I feel like I've addressed as resolved to help me keep track of what items I've still got to address. @itamar-marom, if you feel anyone of the conversations I marked as resolved, in fact are not resolved, please do not hesitate to re-open them!
We're getting close to closing all code-related issues! I wanted to ask you for one more thing - we need tests. Unit testing for relevant functions (It's easy to do it with ChatGPT) and if you could add a use case to our e2e test it would be great. If you need help with this please ping me and I'll help 🙏🏽
Hi @itamar-marom! My turn to say sorry for a long delay in replying, thank you for your patience 🙏 Some time has finally opened up and I'm starting to dig into your latest comments. I expect I'll have some new commits in the coming days addressing them!
Alright, became a bit of rewrite once more. Highlights:
- Using bookkeeping method.
- Utilising the Druid lookup k8s resource name as the lookup id.
- Useful to ensure there only ever are one resource controlling a lookup.
- Means there can only be one lookup named one thing per namespace.
- Thoughts?
- Added
- simple e2e test.
- unit tests.
- Using
LocalObjectReferncenow. - Fields in
DruidLookupSpecandDruidLookupStatusare now documented.
(I tried to fix the e2e test failing due to a name for a minio tenant pool were missing. It did get further, but there still seems to be something wrong. I've had issues running it locally also due to arm64/amd64 issues inside of kind.)
@AdheipSingh code looks good. I'm not sure why the e2e fails. Can you help troubleshoot this?