[BUG] RuntimeError: Docker build failed.

Open FahriBilici opened this issue 1 year ago • 8 comments

Issues Policy acknowledgement

  • [X] I have read and agree to submit bug reports in accordance with the issues policy

Willingness to contribute

Yes. I can contribute a fix for this bug independently.

MLflow version

  • Client: 2.5.0
  • Tracking server: 2.5.0

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): MacOS
  • Python version: 3.10.12
  • yarn version, if running the dev UI:-

Describe the problem

I am trying to dockerize my ML model, but when I use conda or the default env manager, the Docker build fails. With local as the env manager the image builds, but the necessary libraries are not installed and the container shuts down immediately. What should I do?

mlflow models build-docker --model-uri "models:/sentiment/1" --name "docker-sentiment" --env-manager conda

Tracking information

mlflow models build-docker --model-uri "some model" --name "a name "

Code to reproduce issue

REPLACE_ME

Stack trace

REPLACE_ME

Other info / logs

REPLACE_ME

What component(s) does this bug affect?

  • [ ] area/artifacts: Artifact stores and artifact logging
  • [ ] area/build: Build and test infrastructure for MLflow
  • [ ] area/docs: MLflow documentation pages
  • [ ] area/examples: Example code
  • [ ] area/gateway: AI Gateway service, Gateway client APIs, third-party Gateway integrations
  • [ ] area/model-registry: Model Registry service, APIs, and the fluent client calls for Model Registry
  • [ ] area/models: MLmodel format, model serialization/deserialization, flavors
  • [ ] area/recipes: Recipes, Recipe APIs, Recipe configs, Recipe Templates
  • [ ] area/projects: MLproject format, project running backends
  • [ ] area/scoring: MLflow Model server, model deployment tools, Spark UDFs
  • [ ] area/server-infra: MLflow Tracking server backend
  • [ ] area/tracking: Tracking Service, tracking client APIs, autologging

What interface(s) does this bug affect?

  • [ ] area/uiux: Front-end, user experience, plotting, JavaScript, JavaScript dev server
  • [X] area/docker: Docker use across MLflow's components, such as MLflow Projects and MLflow Models
  • [ ] area/sqlalchemy: Use of SQLAlchemy in the Tracking Service or Model Registry
  • [ ] area/windows: Windows support

What language(s) does this bug affect?

  • [ ] language/r: R APIs and clients
  • [ ] language/java: Java APIs and clients
  • [ ] language/new: Proposals for new client languages

What integration(s) does this bug affect?

  • [ ] integrations/azure: Azure and Azure ML integrations
  • [ ] integrations/sagemaker: SageMaker integrations
  • [ ] integrations/databricks: Databricks integrations

FahriBilici avatar Aug 09 '23 13:08 FahriBilici

The problem happens in this step:

#17 2.315 2023/08/09 20:15:06 INFO mlflow.models.container: creating and activating custom environment
#17 3.235 Collecting package metadata (repodata.json): ...working... Traceback (most recent call last):
#17 354.9   File "<string>", line 1, in <module>
#17 354.9   File "/miniconda/lib/python3.11/site-packages/mlflow/models/container/__init__.py", line 115, in _install_pyfunc_deps
#17 354.9     raise Exception("Failed to create model environment.")
#17 354.9 Exception: Failed to create model environment.
#17 ERROR: process "/bin/sh -c python -c                     'from mlflow.models.container import _install_pyfunc_deps;                    _install_pyfunc_deps(                        \"/opt/ml/model\",                         install_mlflow=True,                         enable_mlserver=False,                         env_manager=\"conda\")'" did not complete successfully: exit code: 1
------
 > [13/14] RUN python -c                     'from mlflow.models.container import _install_pyfunc_deps;                    _install_pyfunc_deps(                        "/opt/ml/model",                         install_mlflow=True,                         enable_mlserver=False,                         env_manager="conda")':
2.315 2023/08/09 20:15:06 INFO mlflow.models.container: creating and activating custom environment
Traceback (most recent call last):
354.9   File "<string>", line 1, in <module>
354.9   File "/miniconda/lib/python3.11/site-packages/mlflow/models/container/__init__.py", line 115, in _install_pyfunc_deps
354.9     raise Exception("Failed to create model environment.")
354.9 Exception: Failed to create model environment.

FahriBilici avatar Aug 09 '23 20:08 FahriBilici

This command failed in your case:

conda env create -n custom_env -f {env_path_dst}

Could you provide the output log of the subprocess that executes this command?

Alternatively, you can run conda env create -n custom_env -f {env_path_dst} on its own and send me its output. env_path_dst is the conda YAML file, which you can find in your logged model path.
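
One way to capture that output for posting here is a small Python wrapper around the same command. This is only a sketch: the helper names conda_env_create_cmd and run_and_capture are made up for illustration, "conda.yaml" is a placeholder for the file in your logged model path, and conda is assumed to be on PATH.

```python
import subprocess
import sys
from pathlib import Path


def conda_env_create_cmd(env_file, env_name="custom_env"):
    """Build the argument list for `conda env create -n <name> -f <file>`."""
    return ["conda", "env", "create", "-n", env_name, "-f", str(Path(env_file))]


def run_and_capture(cmd):
    """Run a command and return (exit_code, combined stdout/stderr text)."""
    proc = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr so error messages are kept in order
        text=True,
        check=False,  # a non-zero exit is reported to the caller, not raised
    )
    return proc.returncode, proc.stdout


if __name__ == "__main__":
    # "conda.yaml" is a placeholder; point it at the file from the logged model.
    exit_code, log = run_and_capture(conda_env_create_cmd("conda.yaml"))
    print(log)
    sys.exit(exit_code)
```

Merging stderr into stdout keeps the solver errors next to the progress lines, which is usually what is needed when sharing the log.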

WeichenXu123 avatar Aug 10 '23 02:08 WeichenXu123

I tried creating an env from the conda.yaml generated by the generate-dockerfile command. The env is created without a problem.

FahriBilici avatar Aug 11 '23 09:08 FahriBilici

Could you also try running this command on its own?

bash -c "conda env create -n custom_env -f {env_path_dst}"
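
Note that bash -c reads its command from a single string argument, so the whole conda invocation has to be passed as one quoted string. A minimal sketch of building that argument list in Python, assuming Python 3.8+ for shlex.join; bash_c_argv is a hypothetical helper name and conda.yaml is a placeholder path:

```python
import shlex


def bash_c_argv(args):
    """Wrap a command (list of arguments) as a single string for `bash -c`."""
    # shlex.join quotes each argument where needed, so paths containing
    # spaces survive as one argument inside the bash command string.
    return ["bash", "-c", shlex.join(args)]


if __name__ == "__main__":
    argv = bash_c_argv(
        ["conda", "env", "create", "-n", "custom_env", "-f", "conda.yaml"]
    )
    print(argv)
```

The resulting list can be handed to subprocess.run as-is, without shell=True.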

WeichenXu123 avatar Aug 15 '23 00:08 WeichenXu123

I have been running into the same issue trying to deploy to SageMaker. It seems to work with an older version of MLflow (1.29.0) but fails with MLflow 2.3+.

nfarley-soaren avatar Aug 15 '23 18:08 nfarley-soaren

@mlflow/mlflow-team Please assign a maintainer and start triaging this issue.

mlflow-automation avatar Aug 17 '23 00:08 mlflow-automation

I have the same issue, but with a different error log:

#7 132.9 E: Sub-process /usr/bin/dpkg returned an error code (1)
#7 ERROR: process "/bin/sh -c DEBIAN_FRONTEND=noninteractive TZ=Etc/UTC apt-get install -y --no-install-recommends          wget          curl          nginx          ca-certificates          bzip2          build-essential          cmake          openjdk-8-jdk          git-core          maven     && rm -rf /var/lib/apt/lists/*" did not complete successfully: exit code: 100
------
 > [ 3/21] RUN DEBIAN_FRONTEND=noninteractive TZ=Etc/UTC apt-get install -y --no-install-recommends          wget          curl          nginx          ca-certificates          bzip2          build-essential          cmake          openjdk-8-jdk          git-core          maven     && rm -rf /var/lib/apt/lists/*:
132.7 update-alternatives: using /usr/lib/jvm/java-8-openjdk-amd64/bin/wsgen to provide /usr/bin/wsgen (wsgen) in auto mode
132.7 update-alternatives: using /usr/lib/jvm/java-8-openjdk-amd64/bin/jcmd to provide /usr/bin/jcmd (jcmd) in auto mode
132.7 Setting up openjdk-8-jdk:amd64 (8u402-ga-2ubuntu1~20.04) ...
132.7 update-alternatives: using /usr/lib/jvm/java-8-openjdk-amd64/bin/appletviewer to provide /usr/bin/appletviewer (appletviewer) in auto mode
132.8 update-alternatives: using /usr/lib/jvm/java-8-openjdk-amd64/bin/jconsole to provide /usr/bin/jconsole (jconsole) in auto mode
132.8 Processing triggers for libgdk-pixbuf2.0-0:amd64 (2.40.0+dfsg-3ubuntu0.4) ...
132.8 Processing triggers for libc-bin (2.31-0ubuntu9.14) ...
132.9 Errors were encountered while processing:
132.9  openjdk-11-jre-headless:amd64
132.9 E: Sub-process /usr/bin/dpkg returned an error code (1)
------
Dockerfile:6
--------------------
   4 |     
   5 |     RUN apt-get -y update
   6 | >>> RUN DEBIAN_FRONTEND=noninteractive TZ=Etc/UTC apt-get install -y --no-install-recommends          wget          curl          nginx          ca-certificates          bzip2          build-essential          cmake          openjdk-8-jdk          git-core          maven     && rm -rf /var/lib/apt/lists/*
   7 |     
   8 |     
--------------------
ERROR: failed to solve: process "/bin/sh -c DEBIAN_FRONTEND=noninteractive TZ=Etc/UTC apt-get install -y --no-install-recommends          wget          curl          nginx          ca-certificates          bzip2          build-essential          cmake          openjdk-8-jdk          git-core          maven     && rm -rf /var/lib/apt/lists/*" did not complete successfully: exit code: 100
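
In a long build log like this, the package that actually broke the install is listed under "Errors were encountered while processing:". A rough Python sketch for pulling those names out of a saved log follows; failed_dpkg_packages is a hypothetical helper, and the prefix pattern assumes BuildKit-style line prefixes like the ones in the log above.

```python
import re

# Optional BuildKit prefix at the start of a log line: "#7 132.9 " or "132.9 ".
_PREFIX = re.compile(r"^(?:#\d+\s+)?(?:\d+\.\d+\s)?")

_HEADER = "Errors were encountered while processing:"


def failed_dpkg_packages(log_text):
    """Return the package names dpkg lists after its error header."""
    packages = []
    in_section = False
    for raw in log_text.splitlines():
        line = _PREFIX.sub("", raw, count=1)
        if line.strip() == _HEADER:
            in_section = True
            continue
        if in_section:
            # dpkg prints each failing package on its own indented line;
            # the first non-indented line ends the list.
            if line.startswith(" ") and line.strip():
                packages.append(line.strip())
            else:
                in_section = False
    return packages


if __name__ == "__main__":
    import sys

    # Usage (assumed): python find_failed.py < build.log
    for name in failed_dpkg_packages(sys.stdin.read()):
        print(name)
```

Here the reported package is openjdk-11-jre-headless:amd64, not the openjdk-8-jdk that the Dockerfile requests.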

fschlz avatar Apr 19 '24 17:04 fschlz