UNAVAILABLE: Internal: Unable to set NUMA memory policy: Operation not permitted
Description
When I tried to use NUMA optimization, the following error occurred:
E0915 09:25:07.284131 120033 model_repository_manager.cc:1152] failed to load '${MODEL_NAME}' version 1: Internal: Unable to set NUMA memory policy: Operation not permitted
Triton Information
Triton version = r22.02
Built from the following command:
./build.py --image base,nvcr.io/nvidia/pytorch:${ver}-py3 --cmake-dir=./build \
--build-dir=/tmp/citritonbuild --enable-logging --enable-stats --enable-tracing \
--enable-metrics --enable-gpu-metrics --enable-gpu --endpoint=http --endpoint=grpc \
--repo-tag=common:r${ver} --repo-tag=core:r${ver} --repo-tag=backend:r${ver} --repo-tag=thirdparty:r${ver} \
--backend=pytorch:r${ver} --backend=python:r${ver} --backend=onnxruntime:r${ver} --repoagent=checksum:r${ver} \
--backend=ensemble:r${ver}
To Reproduce
- Add host_policy to config.pbtxt as follows (a fuller config sketch is given after this item):
instance_group [
  {
    count: 1
    kind: KIND_CPU
    host_policy: "policy_numa"
  }
]
This model is built from TorchScript, and it works well when using GPU/CPU (without NUMA).
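For reference, host_policy is a field of the instance_group entry in the model configuration, and its value must match a policy name defined with --host-policy on the command line. A minimal sketch of a complete config.pbtxt for a TorchScript model is shown below; the model name, tensor names, shapes, and max_batch_size are hypothetical placeholders, and only the instance_group block reflects the configuration used here.

name: "example_torchscript_model"  # placeholder; the real name is redacted as ${MODEL_NAME}
platform: "pytorch_libtorch"
max_batch_size: 8                  # assumed value
input [
  {
    name: "INPUT__0"               # placeholder tensor name
    data_type: TYPE_FP32
    dims: [ 16 ]
  }
]
output [
  {
    name: "OUTPUT__0"              # placeholder tensor name
    data_type: TYPE_FP32
    dims: [ 16 ]
  }
]
instance_group [
  {
    count: 1
    kind: KIND_CPU
    host_policy: "policy_numa"     # must match the --host-policy name passed to tritonserver
  }
]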
- Run tritonserver with the following command:
CUDA_VISIBLE_DEVICES=-1 tritonserver --host-policy=policy_numa,numa-node=0 --host-policy=policy_numa,cpu-cores=0-15 --model-repository ${MODEL_REPOSITORY}
- Error messages
WARNING: [Torch-TensorRT] - Unable to read CUDA capable devices. Return status: 100
I0915 09:24:56.045876 120033 libtorch.cc:1306] TRITONBACKEND_Initialize: pytorch
I0915 09:24:56.045963 120033 libtorch.cc:1316] Triton TRITONBACKEND API version: 1.8
I0915 09:24:56.045968 120033 libtorch.cc:1322] 'pytorch' TRITONBACKEND API version: 1.8
I0915 09:24:56.072335 120033 onnxruntime.cc:2319] TRITONBACKEND_Initialize: onnxruntime
I0915 09:24:56.072390 120033 onnxruntime.cc:2329] Triton TRITONBACKEND API version: 1.8
I0915 09:24:56.072404 120033 onnxruntime.cc:2335] 'onnxruntime' TRITONBACKEND API version: 1.8
I0915 09:24:56.072415 120033 onnxruntime.cc:2365] backend configuration:
{}
W0915 09:25:04.175787 120033 pinned_memory_manager.cc:236] Unable to allocate pinned system memory, pinned memory pool will not be available: no CUDA-capable device is detected
I0915 09:25:04.183820 120033 cuda_memory_manager.cc:115] CUDA memory pool disabled
I0915 09:25:04.185852 120033 model_repository_manager.cc:994] loading: ${MODEL_NAME}:1
I0915 09:25:07.283670 120033 libtorch.cc:253] Optimized execution is disabled for model instance '${MODEL_NAME}'
I0915 09:25:07.283702 120033 libtorch.cc:271] Inference Mode is enabled for model instance '${MODEL_NAME}'
I0915 09:25:07.283719 120033 libtorch.cc:346] NvFuser is not specified for model instance '${MODEL_NAME}'
I0915 09:25:07.284028 120033 libtorch.cc:1378] TRITONBACKEND_ModelFinalize: delete model state
E0915 09:25:07.284131 120033 model_repository_manager.cc:1152] failed to load '${MODEL_NAME}' version 1: Internal: Unable to set NUMA memory policy: Operation not permitted
I0915 09:25:15.928282 120033 server.cc:549]
+-------------+-----------------------------------------------------------------+--------+
| Backend | Path | Config |
+-------------+-----------------------------------------------------------------+--------+
| pytorch | /opt/tritonserver/backends/pytorch/libtriton_pytorch.so | {} |
| onnxruntime | /opt/tritonserver/backends/onnxruntime/libtriton_onnxruntime.so | {} |
| python | /opt/tritonserver/backends/python/libtriton_python.so | {} |
+-------------+-----------------------------------------------------------------+--------+
I0915 09:25:15.928351 120033 server.cc:592]
+------------------------------------+---------+----------------------------------------------------------------------------------+
| Model | Version | Status |
+------------------------------------+---------+----------------------------------------------------------------------------------+
| ${MODEL_NAME} | 1 | UNAVAILABLE: Internal: Unable to set NUMA memory policy: Operation not permitted |
+------------------------------------+---------+----------------------------------------------------------------------------------+
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Option | Value |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| server_id | triton |
| server_version | 2.19.0 |
| server_extensions | classification sequence model_repository model_repository(unload_dependents) schedule_policy model_configuration system_shared_memory cuda_shared_memory binary_tensor_data statistics trace |
| model_repository_path[0] | ################## |
| model_control_mode | MODE_NONE |
| strict_model_config | 1 |
| rate_limit | OFF |
| pinned_memory_pool_byte_size | 268435456 |
| response_cache_byte_size | 0 |
| min_supported_compute_capability | 6.0 |
| strict_readiness | 1 |
| exit_timeout | 30 |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
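A note on the environment that may be relevant here: if tritonserver is running inside a Docker container, a common cause of this EPERM is that Docker's default seccomp profile only allows the set_mempolicy/mbind system calls, which setting a NUMA memory policy relies on, when the container is granted the SYS_NICE capability. The sketch below shows a launch with that capability added; the image tag and mount path are placeholders, not values taken from this issue.

# Hypothetical launch command; image tag and paths are placeholders.
# --cap-add SYS_NICE allows set_mempolicy()/mbind() inside the container,
# which is needed to apply the NUMA policy defined by --host-policy.
docker run --rm --cap-add SYS_NICE \
  -v /path/to/model_repository:/models \
  tritonserver_custom:22.02 \
  tritonserver --host-policy=policy_numa,numa-node=0 \
               --host-policy=policy_numa,cpu-cores=0-15 \
               --model-repository /models

Whether NUMA policies can be applied at all in a given container can also be checked independently of Triton, e.g. with numactl --cpunodebind=0 --membind=0 true, which should fail in the same way when those calls are blocked.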
It looks like the name of the model is: ${MODEL_NAME}. Is that correct?
Closing issue due to lack of activity. Please re-open the issue if you would like to follow up on this.