datahub-helm
Support ingestion recipes from secrets
Is your feature request related to a problem? Please describe.
The problem is that ingestion recipes are currently loaded from ConfigMaps even though they often contain sensitive values. This makes life a bit harder to manage these files via code.
Describe the solution you'd like
I'd like DataHub's Helm chart to also support Secrets as a source of the ingestion recipe.
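To make the request concrete, here is a minimal sketch of the Secret side only: packing a recipe file into a Kubernetes Secret manifest, whose `data` values must be base64-encoded. The helper name `recipe_secret_manifest` is hypothetical, the names `ingestion-dbt` and `pipeline.yml` are just illustrative, and the chart option that would point a cron at such a Secret does not exist yet, so it is not shown.

```python
import base64
import json


def recipe_secret_manifest(name: str, file_name: str, recipe_text: str) -> dict:
    """Build a Kubernetes Secret manifest holding one recipe file.

    Hypothetical helper for illustration; not part of the DataHub chart.
    """
    # Values under `data` in a Secret must be base64-encoded strings.
    encoded = base64.b64encode(recipe_text.encode("utf-8")).decode("ascii")
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name},
        "type": "Opaque",
        "data": {file_name: encoded},
    }


# Illustrative names only; the recipe body here is a placeholder.
manifest = recipe_secret_manifest("ingestion-dbt", "pipeline.yml", "pipeline_name: dbt\n")
print(json.dumps(manifest, indent=2))
```

The printed JSON could be applied like any other manifest. In practice a secrets operator would probably create the Secret instead, so this only illustrates the shape the chart would need to mount.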
Describe alternatives you've considered
I suppose I could use an init container with an env var and then rewrite the file mounted by the ConfigMap with envsubst. I am also aware that DataHub provides its own secrets management solution, but this is not the best approach for us, as we already have another secrets provider in place and load secrets from there via external secrets.
Additional context
n/a
Notes
I am interested in developing the feature myself. I am just waiting for the maintainers' position on this proposal.
This issue is stale because it has been open for 30 days with no activity. If you believe this is still an issue on the latest DataHub release please leave a comment with the version that you tested it with. If this is a question/discussion please head to https://slack.datahubproject.io. For feature requests please use https://feature-requests.datahubproject.io
This issue was closed because it has been inactive for 30 days since being marked as stale.
I see no issue with supporting this configuration using secrets, if you'd like to contribute this I think we would be open to accepting it :)
Great! I'll work on it and let you know later!
> This makes life a bit harder to manage these files via code.
Hi, thanks for the contribution. I would prefer to download secrets from the hashicorp vault directly. Would you like to create such functionality?
> download secrets from the hashicorp vault directly
@alplatonov Vault supports annotations to directly set the secret values as env vars in the container. Is that what you want? e.g. https://stackoverflow.com/questions/61239479/injecting-vault-secrets-into-kubernetes-pod-environment-variable
In my case, I have API access enabled, and therefore every ingestion needs a token. I store the token in AWS Secrets Manager and sync it into the Kubernetes secret datahub-secrets via ExternalSecret. As a workaround, I used an initContainer to run envsubst and write the substituted recipe to a volume shared with the actual ingestion container:
datahub-ingestion-cron:
  enabled: true
  image:
    repository: acryldata/datahub-ingestion
    tag: "v0.11.0.2"
  crons:
    dbt:
      # Daily, at 4:30am
      schedule: "30 4 * * *"
      recipe:
        configmapName: ingestion-dbt
        fileName: pipeline.yml
      serviceAccountName: dbt-irsa
      command: |
        datahub ingest -c /etc/ingestion/dbt.yml
      extraVolumes:
        - name: ingestion
          emptyDir: {}
      extraVolumeMounts:
        - name: ingestion
          mountPath: /etc/ingestion
      extraInitContainers:
        - name: recipe-rewriter
          image: bhgedigital/envsubst:v1.0-alpine3.6
          env:
            - name: "INGESTION_TOKEN"
              valueFrom:
                secretKeyRef:
                  name: "datahub-secrets"
                  key: "ingestion_token"
          command: ['sh', '-c', 'envsubst < /etc/recipes/pipeline.yml > /etc/ingestion/dbt.yml']
          volumeMounts:
            - name: recipe
              mountPath: /etc/recipes
            - name: ingestion
              mountPath: /etc/ingestion
The actual ingestion recipe (committed to git) looks like:
apiVersion: v1
kind: ConfigMap
metadata:
  name: ingestion-dbt
data:
  pipeline.yml: |
    pipeline_name: dbt
    sink:
      type: datahub-rest
      config:
        server: http://datahub-datahub-gms:8080
        token: ${INGESTION_TOKEN}
    source:
      type: dbt
      config:
        aws_connection:
          ...
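For anyone unsure what the init container actually does to this file, here is a rough Python sketch of the substitution step. It is not envsubst itself, just the same idea using `string.Template` from the standard library; the function name `render_recipe` and the `dummy-token` value are made up for illustration.

```python
import string

# Trimmed copy of the recipe above, with the same placeholder.
RECIPE_TEMPLATE = """\
pipeline_name: dbt
sink:
  type: datahub-rest
  config:
    server: http://datahub-datahub-gms:8080
    token: ${INGESTION_TOKEN}
"""


def render_recipe(template: str, env: dict) -> str:
    """Replace $VAR / ${VAR} placeholders, similar in spirit to envsubst."""
    # substitute() raises KeyError when a placeholder has no value, which
    # surfaces a missing secret early. envsubst would instead silently
    # substitute an empty string for an unset variable.
    return string.Template(template).substitute(env)


# In the init container the value comes from the secretKeyRef env var;
# a dummy value stands in for it here.
rendered = render_recipe(RECIPE_TEMPLATE, {"INGESTION_TOKEN": "dummy-token"})
print(rendered)
```

The rendered text is what ends up in /etc/ingestion/dbt.yml in my setup, so the token never has to live in the ConfigMap or in git.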
My idea for the feature is to support this via the Helm chart, i.e. an extraInitContainer pulling envsubst and rewriting the recipe for the ingestion workload. So, instead of setting env.[*].valueFrom in the initContainer, you would be able to just annotate, e.g.:
vault.hashicorp.com/agent-inject-template-config: |
  {{ with secret "secret/data/mysecret" -}}
    export MY_SECRET="{{ .Data.data.MY_SECRET.MY_SECRET_KEY }}"
  {{- end }}
in the init container, and then envsubst would replace the sensitive value for you at runtime, before DataHub tries to fetch its internal secrets from its own store. Wdyt?
Feel free to copy my workaround for yourself while this is not officially supported <3
This issue is stale because it has been open for 30 days with no activity. If you believe this is still an issue on the latest DataHub release please leave a comment with the version that you tested it with. If this is a question/discussion please head to https://slack.datahubproject.io. For feature requests please use https://feature-requests.datahubproject.io
don't stale it yet
how i miss having time for this =(