
Discuss how to treat environment variables wrt naming

Open mdedetrich opened this issue 2 years ago • 0 comments

What is currently missing?

Guardian can be configured using environment variables, which are declared in the various reference.conf files found in the modules (proper documentation for this still needs to be written and will be added later).

Currently the naming of these environment variables mirrors the corresponding reference.conf settings, which is how Java/Scala libraries are conventionally configured. However, because of how the Apache Kafka client, Alpakka, and S3 libraries are designed, the naming and structure of these reference.conf settings are inconsistent with one another. As a quick example, to configure the Kafka bootstrap servers you would do

KAFKA_CLIENT_BOOTSTRAP_SERVERS="localhost:9092"

but to set the consumer poll interval you would do

AKKA_KAFKA_CONSUMER_POLL_INTERVAL="1 minute"

This is because the former is a setting for Apache's official Java Kafka Client, whereas the latter is a setting for Alpakka's Kafka client, which happens to wrap Apache's Java Kafka Client.

Furthermore, some settings that are passed directly into Apache's Kafka client, such as the bootstrap servers, take a list as a single comma-delimited value, i.e.

KAFKA_CLIENT_BOOTSTRAP_SERVERS="localhost:9092,localhost:9093"

but the Kafka topics config lets you specify each list element in its own environment variable using a dot followed by the element's index, i.e.

KAFKA_CLUSTER_TOPICS.0=first_topic
KAFKA_CLUSTER_TOPICS.1=second_topic

This is because Typesafe Config allows you to configure list-based settings directly by index using . (which is valid under POSIX for environment variables), whereas Apache's Kafka client chose to represent lists differently.
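To make the difference concrete, here is a minimal Python sketch of what a consumer of these variables has to do for each style. It is not how Typesafe Config or the Kafka client is implemented; the helper names `read_comma_list` and `read_indexed_list` are made up for illustration, and the environment is modelled as a plain dict rather than read from the real process environment.

```python
import re


def read_comma_list(env, name):
    """Read a list stored as one comma-delimited value, e.g. NAME="a,b"."""
    raw = env.get(name, "")
    return [item.strip() for item in raw.split(",") if item.strip()]


def read_indexed_list(env, name):
    """Read a list stored as NAME.0, NAME.1, ... ordered by numeric index."""
    pattern = re.compile(re.escape(name) + r"\.(\d+)")
    indexed = []
    for key, value in env.items():
        match = pattern.fullmatch(key)
        if match:
            indexed.append((int(match.group(1)), value))
    return [value for _, value in sorted(indexed)]


# The environment is modelled as a dict; the values are the ones from this issue.
env = {
    "KAFKA_CLIENT_BOOTSTRAP_SERVERS": "localhost:9092,localhost:9093",
    "KAFKA_CLUSTER_TOPICS.0": "first_topic",
    "KAFKA_CLUSTER_TOPICS.1": "second_topic",
}

servers = read_comma_list(env, "KAFKA_CLIENT_BOOTSTRAP_SERVERS")
topics = read_indexed_list(env, "KAFKA_CLUSTER_TOPICS")
print(servers)
print(topics)
```

Both calls return an ordinary list of strings, but a user has to know in advance which convention applies to which variable.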

Hence, for largely technical/language reasons, the environment configuration is inconsistent. Put differently, the configuration settings reflect the idiosyncrasies of the Java/Scala ecosystem and of how each library works, rather than treating the implementation/language of Guardian as a black box and offering a consistent, clear way to configure it.

How could this be improved?

There is a strong argument that, rather than using environment variables that mirror how Java/Scala libraries are configured through reference.conf/Java properties (which ultimately leads to confusion), we should instead strive for completely consistent environment variables across the whole app. The cost is possibly confusing a Java/Scala developer who wants to use Guardian directly as an app in a non-typical way (rather than as a CLI or a Docker tool).
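As a purely illustrative sketch of what "consistent" could mean, the Python snippet below translates a hypothetical unified naming scheme into the variables that exist today. The `GUARDIAN_*` names and the choice of comma-delimited values for every list are assumptions made up for this example, not names Guardian defines and not a decided design; only the right-hand variable names come from this issue.

```python
# Hypothetical unified names (keys) mapped to the existing variables (values).
# The GUARDIAN_* names are invented for illustration only.
SCALAR_ALIASES = {
    "GUARDIAN_KAFKA_BOOTSTRAP_SERVERS": "KAFKA_CLIENT_BOOTSTRAP_SERVERS",
    "GUARDIAN_KAFKA_POLL_INTERVAL": "AKKA_KAFKA_CONSUMER_POLL_INTERVAL",
}

# Unified list variables that must be expanded into NAME.0, NAME.1, ...
LIST_ALIASES = {
    "GUARDIAN_KAFKA_TOPICS": "KAFKA_CLUSTER_TOPICS",
}


def translate(env):
    """Translate unified variables into the names used today."""
    translated = {}
    for unified, legacy in SCALAR_ALIASES.items():
        if unified in env:
            translated[legacy] = env[unified]
    for unified, legacy in LIST_ALIASES.items():
        if unified in env:
            items = [i.strip() for i in env[unified].split(",") if i.strip()]
            for index, item in enumerate(items):
                translated[f"{legacy}.{index}"] = item
    return translated


unified_env = {
    "GUARDIAN_KAFKA_BOOTSTRAP_SERVERS": "localhost:9092,localhost:9093",
    "GUARDIAN_KAFKA_POLL_INTERVAL": "1 minute",
    "GUARDIAN_KAFKA_TOPICS": "first_topic,second_topic",
}

legacy_env = translate(unified_env)
for key in sorted(legacy_env):
    print(f"{key}={legacy_env[key]}")
```

With a scheme like this, the user sees one prefix and one list syntax, and the library-specific naming stays an internal detail.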

Is this a feature you would work on yourself?

  • [ ] I plan to open a pull request for this feature

mdedetrich, Jan 25 '22 14:01