MLOpsPython
MLOps image: Dataprep failed to start Engine when dotnetcore2 '*.a' files are missing
I'm trying to reduce the MLOps image size, because it's 4.35 GB. I followed some steps to reduce the Docker image size: clean the conda caches and remove files that are not needed.
Reproduce:
To reproduce inside the current image:
# Start bash in the image
docker run -it --rm -v ~/.azure:/root/.azure/ -v ~/.azureml:/root/.azureml/ mcr.microsoft.com/mlops/python
Cleanup steps, which should end up in the Docker build of the image:
# The test: A working engine:
python -c 'import logging; logging.basicConfig(level="DEBUG"); from azureml.dataprep.api.engineapi import api; api.EngineAPI()'
# First cleanup step (~725 MB):
conda clean -tpiy
# Still working engine:
python -c 'import logging; logging.basicConfig(level="DEBUG"); from azureml.dataprep.api.engineapi import api; api.EngineAPI()'
# Second cleanup step (19392 files)
find /usr/local/envs/mlopspython_ci/ -follow -type f -name '*.pyc' -delete
# Still working engine:
python -c 'import logging; logging.basicConfig(level="DEBUG"); from azureml.dataprep.api.engineapi import api; api.EngineAPI()'
# Third cleanup step
find /usr/local/envs/mlopspython_ci/ -follow -type f -name '*.a' -delete
# Broken engine:
python -c 'import logging; logging.basicConfig(level="DEBUG"); from azureml.dataprep.api.engineapi import api; api.EngineAPI()'
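Before running the third step, a dry run can show which directories hold '*.a' files and how many each contains. This is a minimal sketch: the function name list_static_lib_dirs is hypothetical and not part of the image, and it only prints, it deletes nothing.

```shell
# Hypothetical helper: count '*.a' files per directory below the given root.
# It replaces -delete with -print, so nothing is removed.
list_static_lib_dirs() {
    find "$1" -follow -type f -name '*.a' -print \
        | sed 's|/[^/]*$||' \
        | sort | uniq -c | sort -rn
}
```

Inside the image it could be called as list_static_lib_dirs /usr/local/envs/mlopspython_ci/ to get one line per directory, largest count first.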
The required '*.a' files are part of site-packages/dotnetcore2/bin/shared/Microsoft.NETCore.App. I ended up excluding this path. My final image is 1.8 GB.
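Excluding that path can be sketched as a variant of the third cleanup step. The function name prune_static_libs is hypothetical, and the pattern assumes dotnetcore2 sits under a site-packages directory inside the environment. It keeps the whole dotnetcore2 package rather than only the Microsoft.NETCore.App subdirectory, which is a cautious choice and not necessarily the minimal one.

```shell
# Hypothetical helper: delete '*.a' files below the given root,
# but keep everything under site-packages/dotnetcore2.
prune_static_libs() {
    find "$1" -follow -type f -name '*.a' \
        ! -path '*/site-packages/dotnetcore2/*' -delete
}
```

Inside the image it could be called as prune_static_libs /usr/local/envs/mlopspython_ci/, after which the EngineAPI check from the steps above can be rerun to confirm that the engine still starts.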
If the files are required by design, the error message could be clearer, or the auto-downloader should fetch them.
Note: The MLOps image is built FROM conda/miniconda3; the Git repo behind it mentions that the repo is deprecated in favor of ContinuumIO.