trino-gateway
Separate loading of application configurations and secrets into more than one file
The goal of this request is to have the loading of basic configurations and secrets split up to provide a separation of concerns.
One case where this is needed is running Trino Gateway in Kubernetes using the provided Helm chart. Currently, all of the configurations and secrets are merged and mounted to the container as a single YAML file. Following this strategy means:
- In a secured K8s cluster, you are unable to introspect the configuration being mounted to the pod unless you have elevated permissions to view Secret objects. Usually, ConfigMap resources are R/O to the majority of users.
- Your deployment strategy requires elevated permissions to view Secret objects so that a lookup can be done on pre-existing secrets, which will be merged into the standalone secret.
  - A widely used deployment tool, ArgoCD, does not support the lookup strategy.
Opening this issue to spur conversations on better strategies for configuring the application.
An initial suggestion could be leveraging Jackson's ObjectMapper with YAMLFactory and the readerForUpdating functionality. This could support loading of multiple configurations by the following strategy:
- Update the application to accept more than one configuration file path through arguments.
- Leverage Jackson's ObjectMapper with YAMLFactory and initialize/update the application config using the mapper's readerForUpdating method.
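To make the layering idea concrete, here is a minimal sketch in plain Java of overlaying a secrets file onto a base configuration once both have been parsed into nested maps. It deliberately does not use Jackson, and it is not a claim about how readerForUpdating treats nested objects (that may depend on Jackson's merge settings). The class name `ConfigMerge` and the keys `dataStore`, `jdbcUrl`, `user`, and `password` are illustrative assumptions, not the gateway's actual loading code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: overlays later configuration layers onto earlier ones.
final class ConfigMerge {
    private ConfigMerge() {}

    // Returns a new map containing base with override applied on top.
    // Nested maps are merged recursively; any other value in override replaces
    // the value in base. Untouched nested maps are shared with base, not copied.
    @SuppressWarnings("unchecked")
    static Map<String, Object> merge(Map<String, Object> base, Map<String, Object> override) {
        Map<String, Object> result = new LinkedHashMap<>(base);
        for (Map.Entry<String, Object> entry : override.entrySet()) {
            Object existing = result.get(entry.getKey());
            Object incoming = entry.getValue();
            if (existing instanceof Map && incoming instanceof Map) {
                result.put(entry.getKey(),
                        merge((Map<String, Object>) existing, (Map<String, Object>) incoming));
            } else {
                result.put(entry.getKey(), incoming);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Stand-in for the non-secret file (for example, mounted from a ConfigMap).
        Map<String, Object> dataStore = new LinkedHashMap<>();
        dataStore.put("jdbcUrl", "jdbc:postgresql://db:5432/gateway");
        dataStore.put("user", "gateway");
        Map<String, Object> base = new LinkedHashMap<>();
        base.put("dataStore", dataStore);

        // Stand-in for the secrets file (for example, mounted from a Secret).
        Map<String, Object> secrets = new LinkedHashMap<>();
        secrets.put("dataStore", Map.of("password", "from-secret-file"));

        System.out.println(merge(base, secrets));
    }
}
```

With this shape, the non-secret file stays readable to anyone who can view ConfigMaps, while the secrets file only needs to carry the keys that are actually sensitive.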
I'm not 100% sure how Trino is handling this in their Kubernetes deployment but it would be interesting to see if we can follow suit.
We currently have it all in one file to keep it simple. Once the Airlift refactor is in, we plan to work with the Trino and Airlift projects to figure out a good approach. Our need is somewhat different since we use YAML for nested configs quite a bit and Trino/Airlift currently don't (yet). So maybe we do something like:
- config.properties like in Trino for global config
- log.properties for logging package config
- jvm.config for JVM start configuration
- maybe node.properties, but ideally not, since all TGW nodes should be the same and stateless
- some way to get more YAML config in
- some better way for secrets, similar to how in Trino we can use env.* and maybe more
We can discuss more in the next sync for starters. Also fyi and for sanity check @willmostly @Chaho12 @vishalya and @oneonestar
I think environment variables are a good way to go.
We can use env with secretKeyRef to set an environment variable from a Secret in the K8s YAML.
After migrating to Airlift, settings in the httpConfig section can use environment variables (implemented by Airlift).
We might want to add the same capability to other sections.
httpConfig:
  http-server.https.keystore.path: cert.jks
  http-server.https.keystore.key: ${ENV:KEYSTORE_KEY}
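As a rough illustration of what extending this to other sections could involve, the sketch below resolves `${ENV:NAME}` placeholders in a single configuration value. It is a stdlib-only approximation and not Airlift's implementation. The class `EnvPlaceholders`, the regular expression, and the choice to fail on an unset variable are all assumptions made for the example.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: substitutes ${ENV:NAME} placeholders in config values.
final class EnvPlaceholders {
    // Matches ${ENV:NAME}, where NAME is a conventional environment variable name.
    private static final Pattern PLACEHOLDER =
            Pattern.compile("\\$\\{ENV:([A-Za-z_][A-Za-z0-9_]*)\\}");

    private EnvPlaceholders() {}

    // Replaces every placeholder in value using the supplied environment map.
    // Throws if a referenced variable is missing, so a typo fails fast at startup.
    static String resolve(String value, Map<String, String> env) {
        Matcher matcher = PLACEHOLDER.matcher(value);
        StringBuilder out = new StringBuilder();
        while (matcher.find()) {
            String name = matcher.group(1);
            String replacement = env.get(name);
            if (replacement == null) {
                throw new IllegalArgumentException("Environment variable not set: " + name);
            }
            // quoteReplacement keeps '$' and '\' in secrets from being read as group references.
            matcher.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        matcher.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // Values without a placeholder pass through unchanged.
        System.out.println(resolve("cert.jks", System.getenv()));
    }
}
```

Passing the environment in as a map, instead of calling `System.getenv()` inside `resolve`, keeps the substitution easy to unit test.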
Also, the plain-text password in the presetUsers section should go away.
We could use something similar to the password file in Trino.
If we decide to pull in io.trino.spi.*, we could reuse FileAuthenticator from Trino.
To revisit this topic now that Airlift is in, would the easiest path forward be to leverage the env variables as suggested by star?
My preference would be volume-mounted files, as filesystem ACLs can limit access to the secret, whereas any user with R/O access to the container would be able to output the environment variables. But it feels like environment variables would come for free?
> whereas any user with R/O access to the container would be able to output the environment variables.
Is it common to allow several users to access the container?
With #483 secrets can be set using ${ENV:PASSWORD}. I think it's common practice to pass secrets using environment variables, which is supported by K8S by default. https://kubernetes.io/docs/tasks/inject-data-application/distribute-credentials-secure/#define-container-environment-variables-using-secret-data
> whereas any user with R/O access to the container would be able to output the environment variables.
> Is it common to allow several users to access the container?
It depends on the org and environment, I suppose. As an example, we lock down our production and pre-production environments to ensure no secrets are accessible. For some levels of folks we permit R/O access to both the servers themselves and the pods, but the latter is less frequent and decided on a case-by-case basis (and environment by environment). If we were not leveraging file ACLs, these people would be able to grab the secrets by printing the env.
Or put differently, random unexpected access by a user has 1 more hurdle to jump prior to having secret access.
> With #483 secrets can be set using ${ENV:PASSWORD}. I think it's common practice to pass secrets using environment variables, which is supported by K8S by default. https://kubernetes.io/docs/tasks/inject-data-application/distribute-credentials-secure/#define-container-environment-variables-using-secret-data
Agreed that it's the simplest path forward, and any security improvements could be a separate ask. This also keeps us aligned with Trino.
This can likely be closed, correct? I see the Helm chart no longer uses lookup and also supports env vars for secrets.
Though I see the configuration is still being mounted as a Secret instead of ConfigMap, is there a reason behind it?