UNAVAILABLE: Internal: Unable to set NUMA memory policy: Operation not permitted
Description
When I tried to use NUMA optimization, the following error occurred:
E0915 09:25:07.284131 120033 model_repository_manager.cc:1152] failed to load '${MODEL_NAME}' version 1: Internal: Unable to set NUMA memory policy: Operation not permitted
Triton Information
Triton version = r22.02
Built from the following command:
./build.py --image base,nvcr.io/nvidia/pytorch:${ver}-py3 --cmake-dir=./build \
--build-dir=/tmp/citritonbuild --enable-logging --enable-stats --enable-tracing \
--enable-metrics --enable-gpu-metrics --enable-gpu --endpoint=http --endpoint=grpc \
--repo-tag=common:r${ver} --repo-tag=core:r${ver} --repo-tag=backend:r${ver} --repo-tag=thirdparty:r${ver} \
--backend=pytorch:r${ver} --backend=python:r${ver} --backend=onnxruntime:r${ver} --repoagent=checksum:r${ver} \
--backend=ensemble:r${ver}
To Reproduce
- Add host_policy to config.pbtxt as follows (a fuller config sketch is given after this item):
instance_group [
  {
    count: 1
    kind: KIND_CPU
    host_policy: "policy_numa"
  }
]
This model is built from TorchScript, and it works well when using GPU/CPU (without NUMA).
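For reference, host_policy is a field of the instance_group entry in the model configuration, and its value must match a policy name defined with --host-policy on the command line. A minimal sketch of a complete config.pbtxt for a TorchScript model is shown below; the model name, tensor names, shapes, and max_batch_size are hypothetical placeholders, and only the instance_group block reflects the configuration used here.

name: "example_torchscript_model"  # placeholder; the real name is redacted as ${MODEL_NAME}
platform: "pytorch_libtorch"
max_batch_size: 8                  # assumed value
input [
  {
    name: "INPUT__0"               # placeholder tensor name
    data_type: TYPE_FP32
    dims: [ 16 ]
  }
]
output [
  {
    name: "OUTPUT__0"              # placeholder tensor name
    data_type: TYPE_FP32
    dims: [ 16 ]
  }
]
instance_group [
  {
    count: 1
    kind: KIND_CPU
    host_policy: "policy_numa"     # must match the --host-policy name passed to tritonserver
  }
]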
- Run tritonserver with the following command:
CUDA_VISIBLE_DEVICES=-1 tritonserver --host-policy=policy_numa,numa-node=0 --host-policy=policy_numa,cpu-cores=0-15 --model-repository ${MODEL_REPOSITORY}
- Error messages
WARNING: [Torch-TensorRT] - Unable to read CUDA capable devices. Return status: 100
I0915 09:24:56.045876 120033 libtorch.cc:1306] TRITONBACKEND_Initialize: pytorch
I0915 09:24:56.045963 120033 libtorch.cc:1316] Triton TRITONBACKEND API version: 1.8
I0915 09:24:56.045968 120033 libtorch.cc:1322] 'pytorch' TRITONBACKEND API version: 1.8
I0915 09:24:56.072335 120033 onnxruntime.cc:2319] TRITONBACKEND_Initialize: onnxruntime
I0915 09:24:56.072390 120033 onnxruntime.cc:2329] Triton TRITONBACKEND API version: 1.8
I0915 09:24:56.072404 120033 onnxruntime.cc:2335] 'onnxruntime' TRITONBACKEND API version: 1.8
I0915 09:24:56.072415 120033 onnxruntime.cc:2365] backend configuration:
{}
W0915 09:25:04.175787 120033 pinned_memory_manager.cc:236] Unable to allocate pinned system memory, pinned memory pool will not be available: no CUDA-capable device is detected
I0915 09:25:04.183820 120033 cuda_memory_manager.cc:115] CUDA memory pool disabled
I0915 09:25:04.185852 120033 model_repository_manager.cc:994] loading: ${MODEL_NAME}:1
I0915 09:25:07.283670 120033 libtorch.cc:253] Optimized execution is disabled for model instance '${MODEL_NAME}'
I0915 09:25:07.283702 120033 libtorch.cc:271] Inference Mode is enabled for model instance '${MODEL_NAME}'
I0915 09:25:07.283719 120033 libtorch.cc:346] NvFuser is not specified for model instance '${MODEL_NAME}'
I0915 09:25:07.284028 120033 libtorch.cc:1378] TRITONBACKEND_ModelFinalize: delete model state
E0915 09:25:07.284131 120033 model_repository_manager.cc:1152] failed to load '${MODEL_NAME}' version 1: Internal: Unable to set NUMA memory policy: Operation not permitted
I0915 09:25:15.928282 120033 server.cc:549]
+-------------+-----------------------------------------------------------------+--------+
| Backend | Path | Config |
+-------------+-----------------------------------------------------------------+--------+
| pytorch | /opt/tritonserver/backends/pytorch/libtriton_pytorch.so | {} |
| onnxruntime | /opt/tritonserver/backends/onnxruntime/libtriton_onnxruntime.so | {} |
| python | /opt/tritonserver/backends/python/libtriton_python.so | {} |
+-------------+-----------------------------------------------------------------+--------+
I0915 09:25:15.928351 120033 server.cc:592]
+------------------------------------+---------+----------------------------------------------------------------------------------+
| Model | Version | Status |
+------------------------------------+---------+----------------------------------------------------------------------------------+
| ${MODEL_NAME} | 1 | UNAVAILABLE: Internal: Unable to set NUMA memory policy: Operation not permitted |
+------------------------------------+---------+----------------------------------------------------------------------------------+
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Option | Value |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| server_id | triton |
| server_version | 2.19.0 |
| server_extensions | classification sequence model_repository model_repository(unload_dependents) schedule_policy model_configuration system_shared_memory cuda_shared_memory binary_tensor_data statistics trace |
| model_repository_path[0] | ################## |
| model_control_mode | MODE_NONE |
| strict_model_config | 1 |
| rate_limit | OFF |
| pinned_memory_pool_byte_size | 268435456 |
| response_cache_byte_size | 0 |
| min_supported_compute_capability | 6.0 |
| strict_readiness | 1 |
| exit_timeout | 30 |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
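A note on the environment that may be relevant here: if tritonserver is running inside a Docker container, a common cause of this EPERM is that Docker's default seccomp profile only allows the set_mempolicy/mbind system calls, which setting a NUMA memory policy relies on, when the container is granted the SYS_NICE capability. The sketch below shows a launch with that capability added; the image tag and mount path are placeholders, not values taken from this issue.

# Hypothetical launch command; image tag and paths are placeholders.
# --cap-add SYS_NICE allows set_mempolicy()/mbind() inside the container,
# which is needed to apply the NUMA policy defined by --host-policy.
docker run --rm --cap-add SYS_NICE \
  -v /path/to/model_repository:/models \
  tritonserver_custom:22.02 \
  tritonserver --host-policy=policy_numa,numa-node=0 \
               --host-policy=policy_numa,cpu-cores=0-15 \
               --model-repository /models

Whether NUMA policies can be applied at all in a given container can also be checked independently of Triton, e.g. with numactl --cpunodebind=0 --membind=0 true, which should fail in the same way when those calls are blocked.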
It looks like the name of the model is: ${MODEL_NAME}. Is that correct?
Closing issue due to lack of activity. Please re-open the issue if you would like to follow up on this.