cassandra-operator
cassandra-operator copied to clipboard
Refactor construction of Cassandra YAML overrides ConfigMap
The current implementation of
https://github.com/instaclustr/cassandra-operator/blob/1248f73f95e937010203f9eb74f7180641c550c3/pkg/controller/cassandradatacenter/configmap.go#L22
is in my opinion unnecessarily complex and could use some improvement. The way the function calls are structured makes it difficult to follow the code and doesn't allow easy testing.
Let's look at the following flow for example:
https://github.com/instaclustr/cassandra-operator/blob/1248f73f95e937010203f9eb74f7180641c550c3/pkg/controller/cassandradatacenter/configmap.go#L30-L34
- The
addFileFn
function is defined inside the parent function. - The
addCassandraYamlOverrides
is called andaddFileFn
is passed as an argument to it. - When
addFileFn
is invoked insideaddCassandraYamlOverrides
, it in turn callsconfigMapVolumeAddTextFile
which receives a pointer to a ConfigMap and manipulates the original struct.
A simpler, clearer alternative in my opinion would be something similar to the following (I've omitted irrelevant parts of the code on purpose):
// cassandraYamlOverrides returns a YAML string.
o, err := cassandraYamlOverrides(rctx.cdc, seedNodesService)
if err != nil {
// handle error
}
// Replace special chars with a '_'.
k := sanitizePath("cassandra.yaml.d/001-operator-overrides.yaml")
configMap.Data[key] = o
...
Taking this approach should produce simpler code that is easier to follow and write unit tests for.
Hi @johananl , feel free to prepare a PR with your ideas fully implemented, we do not have any problem to merge changes if they result in objectively better solutions. We are glad you are hacking around the code!
Part of the complexity stems from needing to update both the ConfigMap
and ConfigMapVolumeSource
and keep them in sync for every file we want to add to the config. configMapVolumeAddTextFile(...)
handles this appropriately.
Hence your example would look more like this:
// cassandraYamlOverrides returns a YAML string.
o, err := cassandraYamlOverrides(rctx.cdc, seedNodesService)
if err != nil {
// handle error
}
// Replace special chars with a '_'.
path = "cassandra.yaml.d/001-operator-overrides.yaml"
sanitizedPath := sanitizePath(path)
configMap.Data[sanitizedPath] = o
volumeSource.Items = append(volumeSource.Items, corev1.KeyToPath{Key: sanitizedPath, Path: path})
...
Which is basically inlining configMapVolumeAddTextFile(...)
into the caller, which would get verbose quite quickly.
In addition, currently the functions that create the files (addCassandraYamlOverrides(...)
, addCassandraJVMOptions(...)
, etc) are responsible for naming said files. Your suggested approach defers the naming to the caller. The pattern also exists so that a function can add more than one file to the config directory, if needed. We used to do this, but the functions were split up -- so this ability is more of a hold-over, I do note.
Happy to refactor this a bit, but not too much.
@zegelin I definitely understand the rationale behind encapsulating logic and reusing it to keep things DRY. My main point was that in my opinion it is currently tricky to follow the flow inside createOrUpdateOperatorConfigMap()
and I was trying to suggest ways to simplify this part of the operator.
Regarding keeping the ConfigMap and ConfigMapVolumeSource in sync - it is possible to avoid having to explicitly mention each individual file in the volume source by creating a ConfigMap per config directory and letting k8s mount an entire ConfigMap at the specified path inside the container, which it does by default unless you specify the items
field in the ConfigMapVolumeSource:
https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.14/#configmapvolumesource-v1-core
If unspecified, each key-value pair in the Data field of the referenced ConfigMap will be projected into the volume as a file whose name is the key and content is the value. If specified, the listed keys will be projected into the specified paths, and unlisted keys will not be present.
For example, instead of doing
volumes:
- configMap:
items:
- key: cassandra_yaml_d_001_operator_overrides_yaml
path: cassandra.yaml.d/001-operator-overrides.yaml
- key: jvm_options_d_001_jvm_memory_gc_options
path: jvm.options.d/001-jvm-memory-gc.options
name: cassandra-test-dc-cassandra-operator-config
name: operator-config-volume
we could have one ConfigMap for all the files under cassandra.yaml.d/
and another for all the files under jvm.options.d/
, which would allow us to omit the items
section in the volume.
In addition, it's possible that a lot of the current complexity could be avoided by using Go templates for ConfigMap contents instead of constructing things like YAML overrides and JVM flags using Go primitives and then marshaling. Example:
package main
import (
"os"
"text/template"
)
type Config struct {
ThisValue string
ThatValue string
}
const c = `---
myConfig:
someItems:
- name: this
data: {{.ThisValue}}
- name: that
data: {{.ThatValue}}`
func main() {
config := Config{
ThisValue: "stuff",
ThatValue: "moreStuff",
}
t := template.Must(template.New("config").Parse(c))
err := t.Execute(os.Stdout, config)
if err != nil {
panic(err)
}
}
The above prints the following when run:
---
myConfig:
someItems:
- name: this
data: stuff
- name: that
data: moreStuff
To conclude, my aim here was to suggest an improvement to an existing part of the code. Please feel free to close this if you think we shouldn't look into this further.
love the templates!
@johananl there's a small problem we have for mounting the configVolume under cassandra.yaml.d and that is the ability of users to drop additional yaml fragments as part of C* configuration. We currently do this by allowing users to provide their own ConfigMapVolumeSource; having now 2 configMaps that need to end up in one place makes the things a bit more complicated. We solve it by mounting them separately in /tmp folder and then having C* entry point script going over these mounted directories and copying them over into appropriate locations inside /etc/cassandra.
If there's a better alternative to the flow we're using here, we might want to redesign the whole thing, but as it stands it's hard to say which code is easier to understand. I admit that it took me some time to figure out the flow of the createOrUpdateOperatorConfigMap()
, but now it doesn't seem that complicated anymore (ymmv) :-)