cassandra-operator Refactor construction of Cassandra YAML overrides ConfigMap

Refactor construction of Cassandra YAML overrides ConfigMap

Open johananl opened this issue 4 years ago • 5 comments

The current implementation of

https://github.com/instaclustr/cassandra-operator/blob/1248f73f95e937010203f9eb74f7180641c550c3/pkg/controller/cassandradatacenter/configmap.go#L22

is in my opinion unnecessarily complex and could use some improvement. The way the function calls are structured makes it difficult to follow the code and doesn't allow easy testing.

Let's look at the following flow for example:

https://github.com/instaclustr/cassandra-operator/blob/1248f73f95e937010203f9eb74f7180641c550c3/pkg/controller/cassandradatacenter/configmap.go#L30-L34

The addFileFn function is defined inside the parent function.
The addCassandraYamlOverrides is called and addFileFn is passed as an argument to it.
When addFileFn is invoked inside addCassandraYamlOverrides, it in turn calls configMapVolumeAddTextFile which receives a pointer to a ConfigMap and manipulates the original struct.

A simpler, clearer alternative in my opinion would be something similar to the following (I've omitted irrelevant parts of the code on purpose):

// cassandraYamlOverrides returns a YAML string.
o, err := cassandraYamlOverrides(rctx.cdc, seedNodesService)
if err != nil {
    // handle error
}

// Replace special chars with a '_'.
k := sanitizePath("cassandra.yaml.d/001-operator-overrides.yaml")

configMap.Data[key] = o
...

Taking this approach should produce simpler code that is easier to follow and write unit tests for.

Aug 30 '19 14:08 johananl

Hi @johananl , feel free to prepare a PR with your ideas fully implemented, we do not have any problem to merge changes if they result in objectively better solutions. We are glad you are hacking around the code!

Aug 30 '19 15:08 smiklosovic

Part of the complexity stems from needing to update both the ConfigMap and ConfigMapVolumeSource and keep them in sync for every file we want to add to the config. configMapVolumeAddTextFile(...) handles this appropriately.

Hence your example would look more like this:

// cassandraYamlOverrides returns a YAML string.
o, err := cassandraYamlOverrides(rctx.cdc, seedNodesService)
if err != nil {
    // handle error
}

// Replace special chars with a '_'.
path = "cassandra.yaml.d/001-operator-overrides.yaml"
sanitizedPath := sanitizePath(path)

configMap.Data[sanitizedPath] = o
volumeSource.Items = append(volumeSource.Items, corev1.KeyToPath{Key: sanitizedPath, Path: path})
...

Which is basically inlining configMapVolumeAddTextFile(...) into the caller, which would get verbose quite quickly.

In addition, currently the functions that create the files (addCassandraYamlOverrides(...), addCassandraJVMOptions(...), etc) are responsible for naming said files. Your suggested approach defers the naming to the caller. The pattern also exists so that a function can add more than one file to the config directory, if needed. We used to do this, but the functions were split up -- so this ability is more of a hold-over, I do note.

Happy to refactor this a bit, but not too much.

Sep 06 '19 22:09 zegelin

@zegelin I definitely understand the rationale behind encapsulating logic and reusing it to keep things DRY. My main point was that in my opinion it is currently tricky to follow the flow inside createOrUpdateOperatorConfigMap() and I was trying to suggest ways to simplify this part of the operator.

Regarding keeping the ConfigMap and ConfigMapVolumeSource in sync - it is possible to avoid having to explicitly mention each individual file in the volume source by creating a ConfigMap per config directory and letting k8s mount an entire ConfigMap at the specified path inside the container, which it does by default unless you specify the items field in the ConfigMapVolumeSource:

https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.14/#configmapvolumesource-v1-core

If unspecified, each key-value pair in the Data field of the referenced ConfigMap will be projected into the volume as a file whose name is the key and content is the value. If specified, the listed keys will be projected into the specified paths, and unlisted keys will not be present.

For example, instead of doing

volumes:
- configMap:
    items:  
    - key: cassandra_yaml_d_001_operator_overrides_yaml
      path: cassandra.yaml.d/001-operator-overrides.yaml
    - key: jvm_options_d_001_jvm_memory_gc_options
      path: jvm.options.d/001-jvm-memory-gc.options
    name: cassandra-test-dc-cassandra-operator-config
  name: operator-config-volume

we could have one ConfigMap for all the files under cassandra.yaml.d/ and another for all the files under jvm.options.d/, which would allow us to omit the items section in the volume.

In addition, it's possible that a lot of the current complexity could be avoided by using Go templates for ConfigMap contents instead of constructing things like YAML overrides and JVM flags using Go primitives and then marshaling. Example:

package main

import (
	"os"
	"text/template"
)

type Config struct {
	ThisValue string
	ThatValue string
}

const c = `---
myConfig:
  someItems:
  - name: this
    data: {{.ThisValue}}
  - name: that
    data: {{.ThatValue}}`

func main() {
	config := Config{
		ThisValue: "stuff",
		ThatValue: "moreStuff",
	}

	t := template.Must(template.New("config").Parse(c))

	err := t.Execute(os.Stdout, config)
	if err != nil {
		panic(err)
	}

}

The above prints the following when run:

---
myConfig:
  someItems:
  - name: this
    data: stuff
  - name: that
    data: moreStuff

To conclude, my aim here was to suggest an improvement to an existing part of the code. Please feel free to close this if you think we shouldn't look into this further.

Sep 11 '19 11:09 johananl

love the templates!

Sep 11 '19 11:09 smiklosovic

@johananl there's a small problem we have for mounting the configVolume under cassandra.yaml.d and that is the ability of users to drop additional yaml fragments as part of C* configuration. We currently do this by allowing users to provide their own ConfigMapVolumeSource; having now 2 configMaps that need to end up in one place makes the things a bit more complicated. We solve it by mounting them separately in /tmp folder and then having C* entry point script going over these mounted directories and copying them over into appropriate locations inside /etc/cassandra.

If there's a better alternative to the flow we're using here, we might want to redesign the whole thing, but as it stands it's hard to say which code is easier to understand. I admit that it took me some time to figure out the flow of the createOrUpdateOperatorConfigMap(), but now it doesn't seem that complicated anymore (ymmv) :-)

Sep 13 '19 12:09 alourie

cassandra-operator cassandra-operator copied to clipboard

Refactor construction of Cassandra YAML overrides ConfigMap

cassandra-operator
cassandra-operator copied to clipboard