Impossible to load custom python backend

Open RegaliaXYZ opened this issue 2 years ago • 4 comments

Description I'm trying to load a very simple custom python environment and it's failing

Triton Information r22.07

Are you using the Triton container or did you build it yourself? Triton container

To Reproduce

  • conda create -n python3.8 python=3.8
  • conda activate python3.8
  • export PYTHONNOUSERSITE=True
  • conda install -c conda-forge tensorflow
  • conda install -c conda-forge numpy
  • conda install conda-pack
  • conda-pack

config.pbtxt has parameters: { key: "EXECUTION_ENV_PATH", value: {string_value: "$$TRITON_MODEL_DIRECTORY/python3.8.tar.gz"} }

and the folder structure is:

model/
  config.pbtxt
  python3.8.tar.gz
  1/
    model.py
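A minimal sketch, assuming the layout above, of how the EXECUTION_ENV_PATH value could be checked against the repository before starting the server. The helper names resolve_env_path and check_model_dir are hypothetical and not part of Triton; the only Triton behaviour assumed is that $$TRITON_MODEL_DIRECTORY stands for the model's own folder.

```python
from pathlib import Path

PLACEHOLDER = "$$TRITON_MODEL_DIRECTORY"


def resolve_env_path(value: str, model_dir: Path) -> Path:
    """Expand the placeholder to the model folder (assumed behaviour)."""
    if value.startswith(PLACEHOLDER):
        relative = value[len(PLACEHOLDER):].lstrip("/")
        return model_dir / relative
    return Path(value)


def check_model_dir(model_dir: Path, env_value: str, version: str = "1") -> list:
    """Return a list of problems; an empty list means the layout looks complete."""
    problems = []
    if not (model_dir / "config.pbtxt").is_file():
        problems.append("missing config.pbtxt")
    if not (model_dir / version / "model.py").is_file():
        problems.append(f"missing {version}/model.py")
    env_path = resolve_env_path(env_value, model_dir)
    if not env_path.is_file():
        problems.append(f"missing execution environment archive: {env_path}")
    return problems
```

Calling check_model_dir(Path("model"), "$$TRITON_MODEL_DIRECTORY/python3.8.tar.gz") only confirms that the files are where the config expects them. It says nothing about whether the packed environment is loadable on the server's platform.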

Expected behavior: the model loads correctly and can use the tensorflow dependency that is installed in the custom python env.

Actual behavior:

0908 07:24:37.581780 57 pb_stub.cc:241] Failed to initialize Python stub for auto-complete: ImportError:

IMPORTANT: PLEASE READ THIS FOR ADVICE ON HOW TO SOLVE THIS ISSUE!

Importing the numpy C-extensions failed. This error can happen for many reasons, often due to issues with your setup or how NumPy was installed.

We have compiled some common reasons and troubleshooting tips at:

https://numpy.org/devdocs/user/troubleshooting-importerror.html

Please note and check the following:

  • The Python version is: Python3.8 from "/tmp/python_env_0AYzlh/0/bin/python3"
  • The NumPy version is: "1.23.2"

and make sure that they are the versions you expect. Please carefully study the documentation linked above for further help.

Original error was: No module named 'numpy.core._multiarray_umath'

At:
  /tmp/python_env_0AYzlh/0/lib/python3.8/site-packages/numpy/core/__init__.py(52):
  (219): _call_with_frames_removed
  (848): exec_module
  (686): _load_unlocked
  (975): _find_and_load_unlocked
  (991): _find_and_load
  (219): _call_with_frames_removed
  (1050): _handle_fromlist
  /tmp/python_env_0AYzlh/0/lib/python3.8/site-packages/numpy/__init__.py(140):
  (219): _call_with_frames_removed
  (848): exec_module
  (686): _load_unlocked
  (975): _find_and_load_unlocked
  (991): _find_and_load
  /opt/tritonserver/backends/python/triton_python_backend_utils.py(27):
  (219): _call_with_frames_removed
  (848): exec_module
  (686): _load_unlocked
  (975): _find_and_load_unlocked
  (991): _find_and_load
  /models/nlu-oxs-bnk02/1/model.py(1):
  (219): _call_with_frames_removed
  (848): exec_module
  (686): _load_unlocked
  (975): _find_and_load_unlocked
  (991): _find_and_load

RegaliaXYZ avatar Sep 08 '22 07:09 RegaliaXYZ

Hi @RegaliaXYZ, just wanted to clarify, the model repository structure you shared is like this:

<model-repository-path>/
  model/
    config.pbtxt
    python3.8.tar.gz
    1/
      model.py

I followed the steps you shared but couldn't reproduce this issue. After setting up the environment, I ran our model and it worked. One difference I found is that my NumPy version is "1.23.1". Could you check whether there is a NumPy version mismatch? CC @Tabrizian, do you see anything that could help here?

krishung5 avatar Sep 09 '22 20:09 krishung5

@krishung5

Yes, my model repository is as you described. However, the NumPy package I get when following the steps is "numpy 1.23.3 py38h266fe8d_0 conda-forge". Did you install using pip or conda, @krishung5?

RegaliaXYZ avatar Sep 13 '22 08:09 RegaliaXYZ

@RegaliaXYZ I use conda install -c conda-forge numpy to install numpy as you shared in the steps to reproduce.

krishung5 avatar Sep 13 '22 19:09 krishung5

@krishung5

I don't know if it helps, but I'm on an M1 Mac (macOS). It shouldn't affect any of the steps, though.

RegaliaXYZ avatar Sep 13 '22 21:09 RegaliaXYZ

Still no update as to what could cause this?

RegaliaXYZ avatar Oct 03 '22 13:10 RegaliaXYZ

Hi @RegaliaXYZ, thanks for following up. We filed a ticket to look into it. We will update here once we have some progress. CC @Tabrizian

krishung5 avatar Oct 03 '22 18:10 krishung5

I was also having this issue with a conda-pack execution environment built on an M1 Mac. The issue was fixed by building the environment on an Ubuntu-based system, implying a possible issue with Apple Silicon here.

I did also notice, with the same commands as @RegaliaXYZ posted in his issue description, that the NumPy versions when building from conda-forge were 1.23.3 on MacOS and 1.23.1 on the Ubuntu-based system.
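One way to compare the two build machines is to print the facts that decide which compiled extensions a Python interpreter will load. The module named in the original error, numpy.core._multiarray_umath, is a compiled extension, so these values matter more than the NumPy patch version. This is a sketch rather than an official Triton tool, and the function name interpreter_report is made up for illustration. Run it with the environment's Python on the machine that ran conda-pack and again inside the Triton container, then compare the output.

```python
import platform
import sys
import sysconfig


def interpreter_report() -> dict:
    """Collect the values that determine which extension modules can be imported."""
    return {
        "python": "{}.{}".format(*sys.version_info[:2]),
        "system": platform.system(),    # e.g. "Linux" or "Darwin"
        "machine": platform.machine(),  # e.g. "x86_64" or "arm64"
        # File suffix this interpreter expects for compiled extension modules.
        "ext_suffix": sysconfig.get_config_var("EXT_SUFFIX"),
    }


if __name__ == "__main__":
    for key, value in interpreter_report().items():
        print(f"{key}: {value}")
```

If system, machine, or ext_suffix differ between the two runs, the packed environment was built for a different platform than the one the server runs on.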

michaelhagel avatar Oct 07 '22 15:10 michaelhagel

I'm not sure the issue is only in building the environment. I tried packaging the environment in WSL on a Windows laptop and then starting the server on my Mac, and the server failed to start there too.

That was even with the NumPy version being 1.23.1.

RegaliaXYZ avatar Oct 12 '22 13:10 RegaliaXYZ

I'm not sure the issue is only in building the environment. I tried packaging the environment in WSL on a Windows laptop and then starting the server on my Mac, and the server failed to start there too.

The conda-pack environment depends on the OS it was created on. You cannot take a conda package built on one OS and move it to another OS. Mac is not a platform officially supported by the Triton team, and we have not tested conda execution environments on Mac. If you manage to find a solution, please post it here, as it would be helpful to future users facing a similar issue.
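To illustrate the point about OS dependence: compiled extension modules in a packed environment carry a platform tag in their file names, so the archive can be inspected without unpacking it. The sketch below is an assumption-labelled helper, not part of conda-pack or Triton. The names platform_tags and archive_platform_tags are hypothetical, and the regular expression only covers CPython-style names ending in ".so".

```python
import re
import tarfile

# Matches CPython extension modules such as
# "_multiarray_umath.cpython-38-x86_64-linux-gnu.so" or "...cpython-38-darwin.so".
_EXT_RE = re.compile(r"\.cpython-\d+[a-z]*-(?P<platform>[^/]+)\.so$")


def platform_tags(member_names) -> set:
    """Return the platform part of every tagged extension module name."""
    tags = set()
    for name in member_names:
        match = _EXT_RE.search(name)
        if match:
            tags.add(match.group("platform"))
    return tags


def archive_platform_tags(archive_path: str) -> set:
    """Scan a conda-pack tarball for platform tags without extracting it."""
    with tarfile.open(archive_path, "r:*") as archive:
        return platform_tags(archive.getnames())
```

For example, archive_platform_tags("python3.8.tar.gz") on an environment packed on macOS would be expected to contain "darwin", while one packed on x86-64 Linux would be expected to contain "x86_64-linux-gnu". A Linux-based server looks for the Linux-tagged files, which is consistent with the "No module named 'numpy.core._multiarray_umath'" error reported above.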

Tabrizian avatar Oct 14 '22 20:10 Tabrizian

Closing due to inactivity.

Tabrizian avatar Oct 31 '22 20:10 Tabrizian

I was facing the same issue when trying to deploy a model with a custom Python env to a Triton server running with seldon-core v2 on an M1 Mac.

I was able to solve the issue by creating the custom Python env inside an Ubuntu pod running in the same Kubernetes cluster as the Triton server. I used the Ubuntu image version described in point 6 of the important notes on custom env creation.

Here is a step by step solution that works for me:

# Create a ubuntu pod with the same version as mentioned in point 6 of https://github.com/triton-inference-server/python_backend#important-notes
kubectl apply -f ubuntu.yaml

# Shell into the ubuntu pod
kubectl exec --stdin --tty ubuntu -- /bin/bash

# Prepare ubuntu pod (-y answers the confirmation prompts)
apt update && apt upgrade -y
apt install -y wget
cd /tmp

# Install conda, see: https://github.com/conda-forge/miniforge
wget -O Miniforge3.sh "https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-$(uname)-$(uname -m).sh"
bash Miniforge3.sh -b -p "${HOME}/conda"
source "${HOME}/conda/etc/profile.d/conda.sh"

# Create the custom env
conda create -n triton-3.8 python=3.8
conda activate triton-3.8
export PYTHONNOUSERSITE=True
conda install conda-pack

# TODO Add custom packages here
conda install -c conda-forge numpy=1.23.5
conda install -c conda-forge scikit-learn

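# Pack (by default the archive is named after the env, here triton-3.8.tar.gz, and is written to the current directory, /tmp)
conda-pack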

# Exit ubuntu container shell
exit

# Copy the packed env to your local machine for further usage
# Don't remove / before tmp
# TODO: Change path
kubectl cp default/ubuntu:/tmp/ /Users/nw/Downloads -c ubuntu

kubectl delete -f ubuntu.yaml

Niklas2501 avatar Mar 15 '23 10:03 Niklas2501