How to set environment variables for pipeline stages
I'm trying to figure out how to set environment variables when creating a pipeline. My use case is that I want to use S3, and I need to pass the credentials down to tensorflow-io via environment variables. In previous releases of TFX, when using reusable Kubeflow components, I could simply add the environment variables. But if I use the built-in components and compile them to a package via the DSL with KubeflowDagRunnerConfig, I don't see how I can set environment variables for individual stages.
Is the only option to wrap TFX native components in function components?
Thank you for the question. It seems like a duplicate of https://github.com/tensorflow/tfx/issues/3326. I believe that @ConverJens is working on this in https://github.com/tensorflow/tfx/pull/4861.
@dvaldivia I don't think it's possible to set any k8s value (resources, env vars, secrets, etc.) on individual steps, but you can set them at the pipeline level using the pipeline_operator_funcs argument:
kubeflow_dag_runner.KubeflowDagRunnerConfig(
    pipeline_operator_funcs=[your_func_to_set_env_vars_using_k8s_api(env_var_name, env_var_val)]
)
See my answer on this issue: https://github.com/tensorflow/tfx/issues/3194
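For illustration, here is a minimal sketch of what such an operator function might look like. It assumes the KFP v1 SDK, where each step is a ContainerOp whose `.container` exposes `add_env_variable()`, and the `kubernetes` Python client for `V1EnvVar`. The names `set_env_var` and `env_var_factory`, and the `AWS_REGION` value in the usage comment, are hypothetical and not part of TFX.

```python
def set_env_var(name, value, env_var_factory=None):
    """Return an operator function that adds one env var to a pipeline step.

    `set_env_var` and `env_var_factory` are hypothetical names for this sketch.
    `env_var_factory` only exists so the function can be exercised without a
    cluster; by default the Kubernetes client's V1EnvVar is used.
    """
    def _apply(container_op):
        factory = env_var_factory
        if factory is None:
            # Imported lazily; assumes the `kubernetes` Python client is installed.
            from kubernetes.client import V1EnvVar
            factory = V1EnvVar
        # Assumes a KFP v1 ContainerOp, whose `.container` has add_env_variable().
        container_op.container.add_env_variable(factory(name=name, value=value))
        return container_op

    return _apply


# Hypothetical wiring into the runner config (not executed here):
# config = kubeflow_dag_runner.KubeflowDagRunnerConfig(
#     pipeline_operator_funcs=[set_env_var("AWS_REGION", "us-east-1")]
# )
```

Because the function is applied to every step, the variable ends up on all pipeline stages rather than a single one. For real credentials, a variable sourced from a Kubernetes Secret is probably a better fit than a literal value. If I remember correctly, passing pipeline_operator_funcs replaces the default list, so you may want to append to the defaults instead of passing only your own function.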
cc @jiyongjung0
@dvaldivia,
I see that PR #4861 has been merged. It addresses this issue by enabling environment variables in beam_args through placeholders, and it will be part of the upcoming release. Please try the latest nightly build of TFX and let us know if this resolves your issue. Thank you!