azure-sdk-for-python
AzureML workspace environment builds failing (SDK V1)
- Package Name: azureml-sdk
- Package Version: V1
- Operating System: Linux
- Python Version: 3.10
Describe the bug: We have been building ML environments from Dockerfiles in an Azure ML workspace using a CI pipeline. This worked fine until today, when we tried to rebuild an environment with new conda dependencies.
The build job fails with an unexpected error that appears to be an internal bug. Here is the screenshot of the job:
On examining logs, we see this:
The script above, "script.py", is Azure's internal script. I believe its directory-creation call is missing exist_ok=True (the traceback ends at mkdir(name, mode) in os.py), which would explain why it complains that the file already exists.
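To illustrate the exist_ok hypothesis above, here is a minimal sketch of how `os.makedirs` behaves when the target directory already exists. It only demonstrates standard-library behavior; the directory name `image-build` is a placeholder, and nothing here is taken from Azure's internal script.py.

```python
import os
import tempfile


def make_dir_twice(exist_ok: bool) -> str:
    """Create the same directory twice and report what the second call does."""
    with tempfile.TemporaryDirectory() as root:
        target = os.path.join(root, "image-build")  # placeholder name
        os.makedirs(target)  # first call succeeds
        try:
            os.makedirs(target, exist_ok=exist_ok)  # second call
        except FileExistsError:
            return "FileExistsError"
        return "ok"


print(make_dir_twice(exist_ok=False))  # FileExistsError
print(make_dir_twice(exist_ok=True))   # ok
```

Note that `exist_ok=True` only suppresses the error when the existing path is a directory; if a regular file occupies that path, `FileExistsError` is still raised.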
We use
az ml environment create --name "$azureEnvPrefix" --build-context $dockerContext --dockerfile-path Dockerfile --resource-group ${{parameters.resourceGroupName}} --workspace-name ${{parameters.workspaceName}} --tags "dev=$timestamp.CI" "ready_for=${{parameters.targetTag}}"
in a CI task to create envs.
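As a rough sketch of how a CI task might assemble this command, the Python below builds the argument list from parameters. It uses only the flags that appear in the command above; the function name `build_env_create_argv` and every value passed to it are hypothetical placeholders, not part of our pipeline.

```python
import shlex


def build_env_create_argv(name, build_context, resource_group, workspace_name,
                          dockerfile_path="Dockerfile", tags=None):
    """Assemble the argument list for `az ml environment create`."""
    argv = [
        "az", "ml", "environment", "create",
        "--name", name,
        "--build-context", build_context,
        "--dockerfile-path", dockerfile_path,
        "--resource-group", resource_group,
        "--workspace-name", workspace_name,
    ]
    if tags:
        # --tags takes space-separated key=value pairs, as in the command above.
        argv.append("--tags")
        argv.extend(f"{key}={value}" for key, value in tags.items())
    return argv


argv = build_env_create_argv(
    name="preprocess-env",              # placeholder
    build_context="ml_service/docker",
    resource_group="my-rg",             # placeholder
    workspace_name="my-workspace",      # placeholder
    tags={"dev": "20240101.CI"},        # placeholder
)
print(shlex.join(argv))
# To actually run it: subprocess.run(argv, check=True)
```

Passing the arguments as a list avoids shell quoting problems, for example when the build context path contains spaces.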
Hi @obiii - Thanks for opening an issue. We'll take a look asap. cc/ @azureml-github
Hi @obiii , Could you please provide more details and reproduction steps so we can investigate the root cause?
Hi, I assume you mean the Dockerfiles. Please let me know otherwise. The Dockerfiles are generated by a CI pipeline that I cannot share, but here are the resulting files that the CI pipeline uses to build images:
Dockerfile
ARG CONDA_FILE
ARG IMAGE_NAME
FROM mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu20.04
RUN echo "CONDA_FILE: conda_dependencies/preprocess_conda_dependencies.yml"
RUN echo "IMAGE_NAME: mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu20.04"
COPY conda_dependencies/preprocess_conda_dependencies.yml conda_env.yml
RUN rm /bin/sh && ln -s /bin/bash /bin/sh
RUN echo "source /opt/miniconda/etc/profile.d/conda.sh && conda activate" >> ~/.bashrc
RUN cat conda_env.yml
RUN source /opt/miniconda/etc/profile.d/conda.sh && \
conda activate && \
conda install conda && \
pip install cmake && \
conda env update -f conda_env.yml
conda_dependencies/preprocess_conda_dependencies.yml
channels:
  - anaconda
  - conda-forge
  - defaults
dependencies:
  - python=3.11
  - pip:
    - numpy==1.23.5
    - pandas==2.1.4
    - prophet==1.1.5
    - SQLAlchemy==2.0.23
    - urllib3==2.1.0
    - pyodbc==5.0.1
    - mlflow==2.9.1
    - azureml-mlflow==1.54.0.post1
    - azureml-core==1.54.0.post1
    - azureml-dataset-runtime==1.54.0.post1
The command inside the CI pipeline to build the images:
az ml environment create --name "$azureEnvPrefix" --build-context $dockerContext --dockerfile-path Dockerfile --resource-group ${{parameters.resourceGroupName}} --workspace-name ${{parameters.workspaceName}}
You can provide values for the arguments according to your setup. For us, the file structure is as follows:
projectName/
  ml_service/
    docker/
      Dockerfile
      conda_dependencies/
        preprocess_conda_dependencies.yml
dockerBaseDir="ml_service/docker"
dockerContext="$(System.DefaultWorkingDirectory)/$dockerBaseDir"
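Since the build depends on the Dockerfile and its COPY sources being inside the build context, a small pre-flight check can catch layout mistakes before the CI task calls az. The sketch below is an assumption-laden helper, not part of our pipeline: `missing_copy_sources` is a hypothetical name, and it only understands the simple `COPY <src>... <dest>` form (no flags, globs, or multi-stage `--from`).

```python
import tempfile
from pathlib import Path


def missing_copy_sources(context_dir, dockerfile_name="Dockerfile"):
    """List COPY sources in the Dockerfile that do not exist under the build context."""
    context = Path(context_dir)
    dockerfile = context / dockerfile_name
    if not dockerfile.is_file():
        return [dockerfile_name]
    missing = []
    for line in dockerfile.read_text().splitlines():
        parts = line.split()
        # Only the simple "COPY <src>... <dest>" form is handled here.
        if len(parts) >= 3 and parts[0].upper() == "COPY":
            for src in parts[1:-1]:
                if not (context / src).exists():
                    missing.append(src)
    return missing


# Recreate the layout above in a temporary directory, without the conda file.
with tempfile.TemporaryDirectory() as tmp:
    ctx = Path(tmp) / "ml_service" / "docker"
    (ctx / "conda_dependencies").mkdir(parents=True)
    (ctx / "Dockerfile").write_text(
        "FROM mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu20.04\n"
        "COPY conda_dependencies/preprocess_conda_dependencies.yml conda_env.yml\n"
    )
    print(missing_copy_sources(ctx))
```

An empty list means every COPY source was found relative to the context directory.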
Same problem here,
Any ideas?
Hi @isaudagar , is there any update on this please?
Hello @obiii, I have been trying to run the above files and am getting some errors. Can you please provide the contents of the conda.sh file?
Hi @Junnu-akhila , Here is the conda.sh file.
export CONDA_EXE='/opt/miniconda/bin/conda'
export _CE_M=''
export _CE_CONDA=''
export CONDA_PYTHON_EXE='/opt/miniconda/bin/python'
# Copyright (C) 2012 Anaconda, Inc
# SPDX-License-Identifier: BSD-3-Clause
__conda_exe() (
"$CONDA_EXE" $_CE_M $_CE_CONDA "$@"
)
__conda_hashr() {
if [ -n "${ZSH_VERSION:+x}" ]; then
\rehash
elif [ -n "${POSH_VERSION:+x}" ]; then
: # pass
else
\hash -r
fi
}
__conda_activate() {
if [ -n "${CONDA_PS1_BACKUP:+x}" ]; then
# Handle transition from shell activated with conda <= 4.3 to a subsequent activation
# after conda updated to >= 4.4. See issue #6173.
PS1="$CONDA_PS1_BACKUP"
\unset CONDA_PS1_BACKUP
fi
\local ask_conda
ask_conda="$(PS1="${PS1:-}" __conda_exe shell.posix "$@")" || \return
\eval "$ask_conda"
__conda_hashr
}
__conda_reactivate() {
\local ask_conda
ask_conda="$(PS1="${PS1:-}" __conda_exe shell.posix reactivate)" || \return
\eval "$ask_conda"
__conda_hashr
}
conda() {
\local cmd="${1-__missing__}"
case "$cmd" in
activate|deactivate)
__conda_activate "$@"
;;
install|update|upgrade|remove|uninstall)
__conda_exe "$@" || \return
__conda_reactivate
;;
*)
__conda_exe "$@"
;;
esac
}
if [ -z "${CONDA_SHLVL+x}" ]; then
\export CONDA_SHLVL=0
# In dev-mode CONDA_EXE is python.exe and on Windows
# it is in a different relative location to condabin.
if [ -n "${_CE_CONDA:+x}" ] && [ -n "${WINDIR+x}" ]; then
PATH="$(\dirname "$CONDA_EXE")/condabin${PATH:+":${PATH}"}"
else
PATH="$(\dirname "$(\dirname "$CONDA_EXE")")/condabin${PATH:+":${PATH}"}"
fi
\export PATH
# We're not allowing PS1 to be unbound. It must at least be set.
# However, we're not exporting it, which can cause problems when starting a second shell
# via a first shell (i.e. starting zsh from bash).
if [ -z "${PS1+x}" ]; then
PS1=
fi
fi
Hi @isaudagar , is there any update on this please?
Hi, in case you need the image build logs: build_log.txt
Please let me know if there is any update. Thanks :)
Hi @obiii
The FileExistsError typically occurs in Python when you attempt to create a file or directory that already exists. However, looking at the command you provided, it seems that the error might not be directly related to file creation.
In your Dockerfile, you're using the conda env update -f conda_env.yml command to update a Conda environment from a YAML file (conda_env.yml). This error might occur if one of the packages specified in conda_env.yml is already installed in the environment or if the environment itself already exists.
Here are a few things to check and troubleshoot:
- Check Environment Existence: Ensure that the Conda environment specified in the conda_env.yml file exists before attempting to update it. You can use conda env list to see the list of existing environments.
- Check Package Versions: If a package specified in conda_env.yml is already installed but with a different version, Conda might raise an error. Make sure the versions specified in the YAML file are compatible with the current environment.
- Clean Environment: If you're okay with removing the existing environment and recreating it from scratch, you can use conda env remove -n <environment_name> to remove the existing environment before running the update command.
- Check File Paths: Ensure that the conda_env.yml file is located in the correct directory and that the path is correctly specified in the RUN command.
- Permissions: Ensure that the user running the Dockerfile has the necessary permissions to create and modify Conda environments and install packages.
Hi,
But it's not our Dockerfile that is problematic; it's Azure's internal script. Please look at the screenshot, it says: "/azureml-envs/image-build/lib/python3.8/os.py"
The image builds fine locally: I built the Docker image using the same Dockerfile, and it builds and runs without problems.
I have tried changing the dependencies and trimming them down to just one or two, and I have tried changing the base image (even to docker/python:3.11), but nothing works. This Docker setup is also used in our other projects, which I fear will break if we ever try to rebuild their environments. The environment in question was working fine a month ago, and I only added a single pycountry dependency.
Hi @obiii,
Previously I faced some issues with /opt/miniconda/profile.d/conda.sh, which are now resolved. The investigation is in progress, and I will update you once I find anything.
Hi @obiii,
I ran the Dockerfile successfully, but while creating the ML environment I'm getting the error below. The investigation is in progress.
Hi @Junnu-akhila, just to update you: the same build works if the environment is built in the registry instead of the workspace. Not sure why, though!
Hi @obiii
May I know which Python version you are using? The Dockerfile ran successfully after I changed the Python version from 3.11 to 3.10 in the preprocess_conda_dependencies.yml file. Could you please try with Python 3.10?
Thank you.
Hi @obiii ,
Create a my_env.yml file with the content below:
$schema: https://azuremlschemas.azureedge.net/latest/environment.schema.json
name: newdockerenv
build:
path: ml_service/docker
If you get any ml extension issue, please run the command below:
az extension add --name azure-cli-ml
You can use the command below to create the ML environment in the workspace:
az ml environment create --file my_env.yml --resource-group <your RG name> --workspace-name
If you run into any issues, please let me know. Thank you.
Hi @obiii, Could you please confirm, is this issue resolved or not?
az ml environment create --file my_env.yml --resource-group --workspace-name
I tried your method above: It doesn't work. This is the trace:
The yaml file you provided does not match the prescribed schema for Environment yaml files and/or has the following issues:
Error:
- A least one unrecognized parameter is specified
Details: Validation for EnvironmentSchema failed
(x) build:
- Field may not be null.
(x) path:
- Unknown field.
Resolutions:
- Remove any parameters not prescribed by the Environment schema. Visit this link to refer to the Environment schema if needed: https://aka.ms/ml-cli-v2-environment-yaml-reference. If using the CLI, you can also check the full log in debug mode for more details by adding --debug to the end of your command
Also, even if this works, we do not have my_env.yml files for all the environments. We have a docker.template file that gets filled in according to which environment is being built, and it produces a Dockerfile that is used with the az ml command, as explained above.
A Dockerfile produced from the template for a specific preprocess environment is as follows:
# Start with a base image, for example:
# FROM mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu20.04
# Use the provided environment variables for conda and environment file paths
ARG CONDA_FILE
ARG IMAGE_NAME
FROM mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu20.04
RUN echo "CONDA_FILE: conda_dependencies/preprocess_conda_dependencies.yml"
RUN echo "IMAGE_NAME: mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu20.04"
COPY conda_dependencies/preprocess_conda_dependencies.yml conda_env.yml
RUN rm /bin/sh && ln -s /bin/bash /bin/sh
RUN echo "source /opt/miniconda/etc/profile.d/conda.sh && conda activate" >> ~/.bashrc
RUN cat conda_env.yml
RUN source /opt/miniconda/etc/profile.d/conda.sh && \
conda activate && \
conda install conda && \
pip install cmake && \
conda env update -f conda_env.yml --prune
We want to use such Dockerfiles to build the environments: az ml environment create --name "$azureEnvPrefix" --build-context $dockerContext --dockerfile-path Dockerfile --resource-group <someName> --workspace-name <someName>
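For readers unfamiliar with this kind of setup, here is a minimal sketch of how a template could be filled in to produce a Dockerfile per environment. The real docker.template is not shown in this thread, so the template text, the `${IMAGE_NAME}` and `${CONDA_FILE}` placeholder syntax, and the function name `render_dockerfile` are all assumptions made for illustration.

```python
from string import Template

# Hypothetical template text; the real docker.template is not shown in the thread.
DOCKER_TEMPLATE = Template(
    "FROM ${IMAGE_NAME}\n"
    "COPY ${CONDA_FILE} conda_env.yml\n"
    "RUN cat conda_env.yml\n"
)


def render_dockerfile(image_name, conda_file):
    """Fill in the template; substitute() raises KeyError if a placeholder is missing."""
    return DOCKER_TEMPLATE.substitute(IMAGE_NAME=image_name, CONDA_FILE=conda_file)


dockerfile_text = render_dockerfile(
    image_name="mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu20.04",
    conda_file="conda_dependencies/preprocess_conda_dependencies.yml",
)
print(dockerfile_text)
```

Using `substitute` rather than `safe_substitute` makes a missing value fail loudly at render time instead of leaving a literal placeholder in the generated Dockerfile.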
Regarding the suggestion to switch from Python 3.11 to 3.10 in the preprocess_conda_dependencies.yml file: I tried this, and it doesn't work. It results in the same error.
Hi @obiii ,
You need to add some spaces after build in the my_env.yml file, so that path is indented under it:
$schema: https://azuremlschemas.azureedge.net/latest/environment.schema.json
name: newdockerenv
build:
    path: ml_service/docker
Using the command below, I have successfully created the ML environment.
az ml environment create --file my_env.yml --resource-group --workspace-name
I was also able to create the environment using your command:
az ml environment create --name "$azureEnvPrefix" --build-context $dockerContext --dockerfile-path Dockerfile --resource-group --workspace-name
With Python 3.11, we get the error: process 'python' exited with status code 1.
With Python 3.10, the job succeeds.
Could you please try the above process?
Thank you.
Hi @obiii, can you create my_env.yml as shown in the screenshot below? You need to put four spaces in front of path in the build section. We tried it both ways, and the environments were created.
If you run into any issues after using this approach, we will work on them.
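To make the indentation point concrete: the validation error quoted earlier in the thread ("build: Field may not be null", "path: Unknown field") is consistent with path sitting at column 0 instead of under build. The sketch below is a hypothetical standard-library helper, not a YAML parser and not part of the az CLI; it only lists keys that start at column 0 so the two layouts can be compared.

```python
def top_level_keys(yaml_text):
    """Return mapping keys that start at column 0 (blank lines and comments skipped)."""
    keys = []
    for line in yaml_text.splitlines():
        if not line.strip() or line.lstrip().startswith("#"):
            continue
        if line[0] not in " \t-" and ":" in line:
            keys.append(line.split(":", 1)[0].strip())
    return keys


flat = "name: newdockerenv\nbuild:\npath: ml_service/docker\n"
nested = "name: newdockerenv\nbuild:\n    path: ml_service/docker\n"

print(top_level_keys(flat))    # ['name', 'build', 'path']
print(top_level_keys(nested))  # ['name', 'build']
```

In the flat layout, path shows up as its own top-level key and build has no value, which matches the two complaints in the validation output.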
Hi @obiii Could you please confirm, is this issue resolved or not?
Hi @obiii. Thank you for opening this issue and giving us the opportunity to assist. To help our team better understand your issue and the details of your scenario please provide a response to the question asked above or the information requested above. This will help us more accurately address your issue.
Hi @Junnu-akhila
Sorry for the late response. I tried it, and it creates the environment, but it doesn't build:
On checking logs I see:
Hi @obiii, as per your errors, you need to provide the Dockerfile path within the Docker context. Could you try again after that? Thank you.
Hi @Junnu-akhila, I realized that. Thanks for the correction. It still fails, and even if it did not fail, the solution is not what we are looking for. We cannot create env.yml files for each environment and use the az command to create an environment from them.
The build logs show a link to a job, which is:
I am not sure why it doesn't build the env in the workspace. For now, we have decided to use a shared ML registry for building environments. Interestingly, the same CI pipeline, Dockerfile, context, and code successfully build the environment in the registry.
Hi @obiii,
We tried what you suggested, and we are able to create an ML environment. As you said, you are now using ML registries for ML environments, and it is working fine. Shall I close this? If you need anything, we will work on it. Thank you.