koperator
Add ability to configure cluster-wide Kafka configurations when using multiple broker groups
Is your feature request related to a problem? Please describe.
Currently kafka-operator supports rack awareness in the sense that the broker rack is inferred from the underlying node's failure domain. It would be useful to also support the ability to spread brokers across different racks defined upfront.
This can be achieved today by using different brokerConfigGroups, one for each AZ, in combination with nodeAffinity, but it has the drawback that the whole Kafka broker configuration needs to be repeated for each broker group.
Hence I am proposing we add a new defaultBrokerConfig field of type BrokerConfig at the cluster.spec level that can be used to define common values reused across all broker groups.
Describe the solution you'd like to see
```yaml
defaultBrokerConfig:
  kafkaHeapOpts: ...
  kafkaJvmPerfOpts: ...
  storageConfigs: ...
  ...
brokerConfigGroups:
  rack1:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: failure-domain.beta.kubernetes.io/zone
                operator: In
                values:
                  - us-east1
  rack2:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: failure-domain.beta.kubernetes.io/zone
                operator: In
                values:
                  - us-east2
  rack3:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: failure-domain.beta.kubernetes.io/zone
                operator: In
                values:
                  - us-east3
```
Describe alternatives you've considered
Using different brokerConfigGroups, but then I have to repeat the whole Kafka broker configuration for each group.
Renamed the issue title to better capture the feature request
@baluchicken @stoader what are your thoughts on adding a new field in the KafkaCluster CRD to define a clusterDefaultBrokerConfig? Something similar to clusterImage, but for all broker configs?
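A minimal sketch of how such a field could sit next to clusterImage in the spec (the clusterDefaultBrokerConfig field is hypothetical, and the concrete values such as the image tag, heap options, and storage size are only placeholders):

```yaml
apiVersion: kafka.banzaicloud.io/v1beta1
kind: KafkaCluster
metadata:
  name: kafka
spec:
  # Existing cluster-wide setting, applied to every broker.
  clusterImage: "ghcr.io/banzaicloud/kafka:2.13-3.1.0"   # illustrative image tag
  # Hypothetical cluster-wide broker defaults (type BrokerConfig), merged into every group.
  clusterDefaultBrokerConfig:
    kafkaHeapOpts: "-Xmx2G -Xms2G"
    storageConfigs:
      - mountPath: /kafka-logs
        pvcSpec:
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 10Gi
  brokerConfigGroups:
    rack1:
      nodeAffinity: ...   # only the per-rack differences remain here
    rack2:
      nodeAffinity: ...
```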
@amuraru what do you think about relying on YAML anchors and aliases (https://www.educative.io/blog/advanced-yaml-syntax-cheatsheet#anchors) for composing parts of the YAML from existing parts (to be more in line with YAML standards)?
Something like:
```yaml
brokerConfigGroups:
  default: &default-config-group
    storageConfigs: ...
    ...
  rack1:
    parent: *default-config-group
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: failure-domain.beta.kubernetes.io/zone
                operator: In
                values:
                  - us-east1
  rack2:
    parent: *default-config-group
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: failure-domain.beta.kubernetes.io/zone
                operator: In
                values:
                  - us-east2
```
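One note on the sketch above: under plain YAML semantics, `parent: *default-config-group` nests the whole anchored mapping under a `parent` key rather than folding its fields into the group, so the operator would have to interpret `parent` itself. If the goal is to stay entirely within the YAML standard, the usual way to compose a mapping from an anchor is the merge key (`<<:`); a minimal sketch, assuming the YAML parser in use supports merge keys (go-yaml does):

```yaml
brokerConfigGroups:
  default: &default-config-group
    storageConfigs: ...
  rack1:
    # Merge key: copies the fields of the anchored mapping into rack1;
    # keys defined directly under rack1 override or extend them.
    <<: *default-config-group
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: failure-domain.beta.kubernetes.io/zone
                operator: In
                values:
                  - us-east1
```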
cc @pregnor @hi-im-aren @bartam1 @panyuenlau @amberimam @Kuvesz
Haven't heard about this anchor before but that's pretty cool. 👍
It also seems to work with our existing YAML handling libraries, so I would support implementing it this way.
YAML anchors sound like a neat idea. Do you know whether kubectl apply understands them though, for example?
Based on my quick search and manual test:
Anchors are a YAML 1.0 feature. There is a blog post about K8s supporting YAML anchors in resource descriptors, but according to the community there is a limitation: only in-file anchors are supported, not cross-file ones.
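To illustrate that limitation (the file names below are only illustrative): an alias can only refer to an anchor defined earlier in the same document, and each file is parsed as its own document, so something like this cannot work:

```yaml
# common.yaml
defaults: &common-settings
  replicas: 3

# other.yaml -- fails to parse: the &common-settings anchor is not visible in this file
spec: *common-settings
```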
I did a quick manual test on an EKS 1.23 cluster of mine and I can confirm that at least in-file YAML anchors are supported by K8s (I don't think the K8s version is relevant, based on the information below). However, it is important to note that when saving resource descriptors, K8s stores the expanded version of the YAML, not the original one. Because of this, K8s probably doesn't handle anchors explicitly; rather, it lets its YAML parsers deal with parsing and expanding them at creation/submission time.
```yaml
# Manually written yaml_anchor_test_configmap.yaml.
apiVersion: v1
data:
  test1: &content |
    this is a test
  test2: *content
  test3: *content
kind: ConfigMap
metadata:
  name: test
```
```sh
kubectl apply -f ~/yaml_anchor_test_configmap.yaml
```
```yaml
# kubectl get configmap test -o yaml
apiVersion: v1
data:
  test1: |
    this is a test
  test2: |
    this is a test
  test3: |
    this is a test
kind: ConfigMap
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","data":{"test1":"this is a test\n","test2":"this is a test\n","test3":"this is a test\n"},"kind":"ConfigMap","metadata":{"annotations":{},"name":"test","namespace":"default"}}
  creationTimestamp: "2022-09-06T09:18:02Z"
  name: test
  namespace: default
  resourceVersion: "591146"
  uid: 7aabfd63-5af5-40e1-9f2e-27712949e77f
```
Also, I expect this to remain a supported feature for a long time because a) the K8s repo/release notes don't mention support for YAML anchors explicitly, and b) the related YAML processing libraries such as go-yaml/yaml (and thus sigs.k8s.io/yaml transitively) have supported this feature for a long time (5+ years based on past issues): there is a go-yaml/yaml issue from 2015, and sigs.k8s.io/yaml uses go-yaml/yaml.
Edit: it also seems to work with structured data, not just scalars (I expect it would work for deeply structured data as well, I just don't have a CRD at hand ATM).
```yaml
apiVersion: v1
data: &structured
  test1: "1"
  test2: "2"
  test3: "3"
kind: ConfigMap
metadata:
  name: test
  annotations: *structured
```
->
```yaml
apiVersion: v1
data:
  test1: "1"
  test2: "2"
  test3: "3"
kind: ConfigMap
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","data":{"test1":"1","test2":"2","test3":"3"},"kind":"ConfigMap","metadata":{"annotations":{"test1":"1","test2":"2","test3":"3"},"name":"test","namespace":"default"}}
    test1: "1"
    test2: "2"
    test3: "3"
  creationTimestamp: "2022-09-06T09:51:31Z"
  name: test
  namespace: default
  resourceVersion: "608257"
  uid: 6e0f9c8b-f449-403e-8fee-085d5eaf7e49
```