Can't load a custom python backend model and a standard python backend model together when each has its own conda environment.
Description
I'm having an issue starting tritonserver with a model repository that contains one model using a custom python backend and another model using the standard python backend, each with its own conda environment. Each model loads successfully on its own; they only fail when loaded together.
Triton Information
What version of Triton are you using? 25.07; the same behavior is also observed on 24.03.
Are you using the Triton container or did you build it yourself? I'm using the nvcr.io Triton images plus a custom python backend.
To Reproduce
Model 1 Config:
backend: "custom-python-backend"
max_batch_size: 0
input [
{
name: "a"
data_type: TYPE_INT64
dims: [-1]
}
]
input [
{
name: "b"
data_type: TYPE_INT64
dims: [-1]
}
]
output [
{
name: "add"
data_type: TYPE_INT64
dims: [-1]
}
]
output [
{
name: "sub"
data_type: TYPE_INT64
dims: [-1]
}
]
version_policy: { specific: { versions: [3]}}
parameters: {key: "EXECUTION_ENV_PATH" value: {string_value: "$$TRITON_MODEL_DIRECTORY/4b9dc66bf7c887c43bf4a9f0177f73e87e7b553917b0b3276ca276c09da2e87d.tar.gz"}}
Model 2 Config:
name: "add_sub_pytorch"
backend: "python"
input [
{
name: "INPUT0"
data_type: TYPE_FP32
dims: [ 4 ]
}
]
input [
{
name: "INPUT1"
data_type: TYPE_FP32
dims: [ 4 ]
}
]
output [
{
name: "OUTPUT0"
data_type: TYPE_FP32
dims: [ 4 ]
}
]
output [
{
name: "OUTPUT1"
data_type: TYPE_FP32
dims: [ 4 ]
}
]
instance_group [{ kind: KIND_CPU }]
version_policy: { specific: { versions: [1]}}
parameters: {key: "EXECUTION_ENV_PATH" value: {string_value: "$$TRITON_MODEL_DIRECTORY/a20ec5948d8ffa118586262a702218acdfef6462b9658ea1f9f76b9309c17e43.tar.gz"}}
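The archives referenced above sit next to each model's config.pbtxt, so $$TRITON_MODEL_DIRECTORY resolves to the paths shown in the "Using Python execution env" log lines below. As a sketch, a quick way to check that an archive actually contains an activate script (host paths assumed from the volume mount and configs above) is:
> tar -tzf ./data/serving/example-server/add_sub/4b9dc66bf7c887c43bf4a9f0177f73e87e7b553917b0b3276ca276c09da2e87d.tar.gz | grep -m1 'bin/activate'
> tar -tzf ./data/serving/example-server/add_sub_pytorch/a20ec5948d8ffa118586262a702218acdfef6462b9658ea1f9f76b9309c17e43.tar.gz | grep -m1 'bin/activate'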
Below is the output from tritonserver when I try to load them together:
> docker run -v ./data:/data --shm-size 4G -v ./custom-python-backend:/opt/tritonserver/backends/custom-python-backend -p 8000:8000 -p 8001:8001 -p 8002:8002 nvcr.io/nvidia/tritonserver:25.07-py3 tritonserver --model-repository=/data/serving/example-server --strict-readiness true
=============================
== Triton Inference Server ==
=============================
NVIDIA Release 25.07 (build 193148794)
Triton Server Version 2.59.1
Copyright (c) 2018-2025, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
Various files include modifications (c) NVIDIA CORPORATION & AFFILIATES. All rights reserved.
GOVERNING TERMS: The software and materials are governed by the NVIDIA Software License Agreement
(found at https://www.nvidia.com/en-us/agreements/enterprise-software/nvidia-software-license-agreement/)
and the Product-Specific Terms for NVIDIA AI Products
(found at https://www.nvidia.com/en-us/agreements/enterprise-software/product-specific-terms-for-ai-products/).
WARNING: The NVIDIA Driver was not detected. GPU functionality will not be available.
Use the NVIDIA Container Toolkit to start this container with GPU support; see
https://docs.nvidia.com/datacenter/cloud-native/ .
W0816 14:53:00.196979 1 pinned_memory_manager.cc:273] "Unable to allocate pinned system memory, pinned memory pool will not be available: CUDA driver version is insufficient for CUDA runtime version"
E0816 14:53:00.197091 1 server.cc:248] "CudaDriverHelper has not been initialized."
I0816 14:53:00.197028 1 cuda_memory_manager.cc:117] "CUDA memory pool disabled"
I0816 14:53:00.557296 1 model_lifecycle.cc:473] "loading: add_sub_pytorch:1"
I0816 14:53:00.566654 1 python_be.cc:1851] "Using Python execution env /data/serving/example-server/add_sub_pytorch/a20ec5948d8ffa118586262a702218acdfef6462b9658ea1f9f76b9309c17e43.tar.gz"
I0816 14:53:00.579464 1 model_lifecycle.cc:473] "loading: add_sub:3"
I0816 14:53:00.594293 1 python_be.cc:1851] "Using Python execution env /data/serving/example-server/add_sub/4b9dc66bf7c887c43bf4a9f0177f73e87e7b553917b0b3276ca276c09da2e87d.tar.gz"
I0816 14:53:03.469616 1 model_lifecycle.cc:789] "failed to load 'add_sub_pytorch'"
E0816 14:53:03.469577 1 model_lifecycle.cc:654] "failed to load 'add_sub_pytorch' version 1: Internal: Path /tmp/python_env_kU1zLS/0/bin/activate does not exist. The Python environment should contain an 'activate' script."
E0816 14:53:04.202211 1 model_lifecycle.cc:654] "failed to load 'add_sub' version 3: Internal: archive_write_header() failed with error code = -25 error message is Hard-link target 'lib/python3.11/site-packages/conda_pack/scripts/posix/activate' does not exist."
I0816 14:53:04.202238 1 model_lifecycle.cc:789] "failed to load 'add_sub'"
I0816 14:53:04.202289 1 server.cc:611]
+------------------+------+
| Repository Agent | Path |
+------------------+------+
+------------------+------+
I0816 14:53:04.202310 1 server.cc:638]
+---------+-------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Backend | Path | Config |
+---------+-------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------+
| python | /opt/tritonserver/backends/python/libtriton_python.so | {"cmdline":{"auto-complete-config":"true","backend-directory":"/opt/tritonserver/backends","min-compute-capability":"6.000000","default-max-batch-size":"4"}} |
| mlflow | /opt/tritonserver/backends/mlflow/model.py | {"cmdline":{"auto-complete-config":"true","backend-directory":"/opt/tritonserver/backends","min-compute-capability":"6.000000","default-max-batch-size":"4"}} |
+---------+-------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------+
I0816 14:53:04.202334 1 server.cc:681]
+-----------------+---------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Model | Version | Status |
+-----------------+---------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| add_sub | 3 | UNAVAILABLE: Internal: archive_write_header() failed with error code = -25 error message is Hard-link target 'lib/python3.11/site-packages/conda_pack/scripts/posix/activate' does not exist. |
| add_sub_pytorch | 1 | UNAVAILABLE: Internal: Path /tmp/python_env_kU1zLS/0/bin/activate does not exist. The Python environment should contain an 'activate' script. |
+-----------------+---------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
I0816 14:53:04.202449 1 metrics.cc:783] "Collecting CPU metrics"
I0816 14:53:04.202613 1 tritonserver.cc:2598]
+----------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Option | Value |
+----------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| server_id | triton |
| server_version | 2.59.1 |
| server_extensions | classification sequence model_repository model_repository(unload_dependents) schedule_policy model_configuration system_shared_memory cuda_shared_memory binary_tensor_data parameters statistics trace logging |
| model_repository_path[0] | /data/serving/example-server |
| model_control_mode | MODE_NONE |
| strict_model_config | 0 |
| model_config_name | |
| rate_limit | OFF |
| pinned_memory_pool_byte_size | 268435456 |
| min_supported_compute_capability | 6.0 |
| strict_readiness | 1 |
| exit_timeout | 30 |
| cache_enabled | 0 |
+----------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
I0816 14:53:04.202669 1 server.cc:312] "Waiting for in-flight requests to complete."
I0816 14:53:04.202673 1 server.cc:328] "Timeout 30: Found 0 model versions that have in-flight inferences"
I0816 14:53:04.202713 1 server.cc:343] "All models are stopped, unloading models"
I0816 14:53:04.202715 1 server.cc:352] "Timeout 30: Found 0 live models and 0 in-flight non-inference requests"
error: creating server: Internal - failed to load all models
Expected behavior
Both of my models should be loaded successfully by the python backend.
@UnyieldingOrca Your conda tarball is incorrect. Check the following error from the logs:
"failed to load 'add_sub_pytorch' version 1: Internal: Path /tmp/python_env_kU1zLS/0/bin/activate does not exist. The Python environment should contain an 'activate' script."
How are you creating this tarball? Are you using conda-pack or the Unix tar command?
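For reference, the python backend expects the environment archive to be produced with conda-pack (which packs bin/activate and makes the environment relocatable); a plain tar of the environment directory usually won't work. Something like the following, with the environment name as a placeholder, and then pointing EXECUTION_ENV_PATH at the resulting file:
> conda pack -n my_model_env -o my_model_env.tar.gz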
Maybe you can try: tritonserver xxx --model-load-thread-count=1. I encountered the same issue when starting model 1 with the python backend and model 2 with the vllm backend, where both models have EXECUTION_ENV_PATH set.
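Applied to the reproduction command above, that would look something like this (only the added flag differs; it makes Triton load the models one at a time instead of concurrently):
> docker run -v ./data:/data --shm-size 4G -v ./custom-python-backend:/opt/tritonserver/backends/custom-python-backend -p 8000:8000 -p 8001:8001 -p 8002:8002 nvcr.io/nvidia/tritonserver:25.07-py3 tritonserver --model-repository=/data/serving/example-server --strict-readiness true --model-load-thread-count=1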