[ MLflow Roadmap ] Environment Variable consolidation and migration

Open BenWilson2 opened this issue 1 year ago • 6 comments

Summary

A number of modules in MLflow still define and reference environment variable keys directly. We would like to continue relocating all of these references to a single, clearly documented, and easily referenced location: mlflow/environment_variables.py.

When relocating each of these, please follow the existing documentation standards in environment_variables.py. Be as verbose as necessary to communicate clearly what the environment variable is for, how it should be used, what the available options are (if applicable), and which core MLflow functionality is affected by setting it.

If the environment variable is already declared in environment_variables.py, simply replace the reference to the key in the module listed below with the environment variable handler reference (the getter).
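To make the intended change concrete, here is a minimal sketch of the before/after pattern. It is self-contained and does not import MLflow: `EnvVar`, `_parse_bool`, `s3_settings_before`, and `s3_settings_after` are hypothetical names used only for illustration, and the real handler classes in environment_variables.py may differ in name and behavior. Only the two environment variable keys come from the checklist below.

```python
import os


class EnvVar:
    """Hypothetical, simplified stand-in for a centralized env-var handler."""

    def __init__(self, name, type_, default):
        self.name = name
        self.type = type_
        self.default = default

    @property
    def defined(self):
        # True when the variable is present in the process environment.
        return self.name in os.environ

    def get(self):
        # Return the converted value, or the default when the variable is unset.
        raw = os.environ.get(self.name)
        if raw is None:
            return self.default
        return self.type(raw)


def _parse_bool(raw):
    # Assumed convention for this sketch: accept true/false and 1/0.
    value = raw.strip().lower()
    if value in ("true", "1"):
        return True
    if value in ("false", "0"):
        return False
    raise ValueError(f"Expected a boolean-like value, got {raw!r}")


# Centralized declarations: each key is spelled once and documented in place.

#: Endpoint URL to use for S3 artifact operations (default: None).
MLFLOW_S3_ENDPOINT_URL = EnvVar("MLFLOW_S3_ENDPOINT_URL", str, None)

#: Whether to skip TLS certificate verification for S3 (default: False).
MLFLOW_S3_IGNORE_TLS = EnvVar("MLFLOW_S3_IGNORE_TLS", _parse_bool, False)


def s3_settings_before():
    # Legacy style: raw key strings repeated inside the consuming module.
    return (
        os.environ.get("MLFLOW_S3_ENDPOINT_URL"),
        os.environ.get("MLFLOW_S3_IGNORE_TLS"),
    )


def s3_settings_after():
    # Migrated style: the module only touches the centralized getters.
    return MLFLOW_S3_ENDPOINT_URL.get(), MLFLOW_S3_IGNORE_TLS.get()
```

In this sketch the migrated function also gets typed values and defaults for free, whereas the legacy function returns raw strings that every caller has to interpret on its own.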

Note: Internal environment variables, such as those defined here: https://github.com/mlflow/mlflow/blob/27f62b02ce7246363d95dcd1b224569efe680f4e/mlflow/server/init.py#L21-L27 do not need to be included in this migration of variable definitions. These are set and retrieved by internal MLflow processes and should not be exposed in environment_variables.py.

  • [ ] mlflow.projects.backend.local MLFLOW_S3_ENDPOINT_URL, MLFLOW_S3_IGNORE_TLS
  • [ ] mlflow.store.artifact.azure_blob_artifact_repo AZURE_STORAGE_CONNECTION_STRING
  • [ ] mlflow.metrics.metrics_definition TIKTOKEN_CACHE_DIR
  • [ ] mlflow.store.artifact.s3_artifact_repo MLFLOW_EXPERIMENTAL_S3_SIGNATURE_VERSION
  • [ ] mlflow.utils._capture_transformers_modules USE_TORCH, USE_TF
  • [ ] mlflow.utils.credentials DATABRICKS_CONFIG_FILE, DATABRICKS_CONFIG_PROFILE
  • [ ] mlflow.projects.kubernetes KUBE_MLFLOW_TRACKING_URI
  • [ ] mlflow.utils.databricks_utils DATABRICKS_RUNTIME_VERSION
  • [ ] mlflow.pyfunc.backend, mlflow.pyfunc.__init__ MLFLOW_HOME
  • [ ] mlflow.utils._spark_utils SPARK_DIST_CLASSPATH, PYSPARK_GATEWAY_PORT, PYSPARK_GATEWAY_SECRET

Notes

  • Make sure to open a PR from a non-master branch.

  • Sign off each commit using the -s flag:

    git commit -s -m "..."
             # ^^ make sure to use this
    
  • Include #{issue_number} (e.g. #123) in the PR description when opening a PR.

Oct 20 '23 16:10 BenWilson2

@BenWilson2, may I get this one please? Thanks!

Oct 25 '23 11:10 mkrdip

@mkrdip You sure can! Thank you for volunteering!

Oct 25 '23 13:10 BenWilson2

@mlflow/mlflow-team Please assign a maintainer and start triaging this issue.

Oct 28 '23 00:10 github-actions[bot]

Hey @mkrdip, could you please let me know if you are still willing to work on this? I would like to take it up if you haven't already made some progress.

Dec 19 '23 20:12 sh4x2

@sh4x2, I am working on this during this holiday period. Thanks for checking in. I'll be done this week and submit my PR.

Dec 26 '23 23:12 mkrdip

Hi @mkrdip, what's the status of this? Were you able to make any progress?

Apr 11 '24 16:04 JMLizano

Is there any progress on this? I would like to contribute.

Jul 25 '24 20:07 rahuja23

Hi @JMLizano @BenWilson2, there doesn't seem to be much progress, so I would like to contribute a PR for this issue. OK?

Sep 20 '24 12:09 bmerkle

Sure, go ahead!

Sep 20 '24 14:09 JMLizano