BentoML
bug: error: creating server: Internal - failed to stat file ./model_repository
Describe the bug
I am trying to run the PyTorch Triton example in the repository. I followed the README.md and everything seemed to work fine until step 4 of the Instructions section, where running the Docker container fails with: error: creating server: Internal - failed to stat file ./model_repository
This is the whole output:
2023-08-23T07:42:13+0000 [INFO] [cli] Service loaded from Bento directory: bentoml.Service(tag="triton-integration-pytorch:withgpu", path="/home/bentoml/bento/")
2023-08-23T07:42:13+0000 [INFO] [cli] Environ for worker 0: set CPU thread count to 12
2023-08-23T07:42:13+0000 [INFO] [cli] Prometheus metrics for HTTP BentoServer from "/home/bentoml/bento" can be accessed at http://localhost:3000/metrics.
2023-08-23T07:42:14+0000 [INFO] [cli] Starting production HTTP BentoServer from "/home/bentoml/bento" listening on http://0.0.0.0:3000 (Press CTRL+C to quit)
W0823 07:42:14.602007 32 pinned_memory_manager.cc:236] Unable to allocate pinned system memory, pinned memory pool will not be available: CUDA driver version is insufficient for CUDA runtime version
I0823 07:42:14.603316 32 cuda_memory_manager.cc:115] CUDA memory pool disabled
Error: Failed to initialize NVML
W0823 07:42:14.633157 32 metrics.cc:785] DCGM unable to start: DCGM initialization error
I0823 07:42:14.633736 32 metrics.cc:757] Collecting CPU metrics
I0823 07:42:14.634028 32 tritonserver.cc:2264]
+----------------------------------+--------------------------------------------------------------------------------------------------------+
| Option | Value |
+----------------------------------+--------------------------------------------------------------------------------------------------------+
| server_id | triton |
| server_version | 2.29.0 |
| server_extensions | classification sequence model_repository model_repository(unload_dependents) schedule_policy model_con |
| | figuration system_shared_memory cuda_shared_memory binary_tensor_data statistics trace logging |
| model_repository_path[0] | ./model_repository |
| model_control_mode | MODE_EXPLICIT |
| startup_models_0 | torchscript_yolov5s |
| strict_model_config | 0 |
| rate_limit | OFF |
| pinned_memory_pool_byte_size | 268435456 |
| response_cache_byte_size | 0 |
| min_supported_compute_capability | 6.0 |
| strict_readiness | 1 |
| exit_timeout | 30 |
+----------------------------------+--------------------------------------------------------------------------------------------------------+
I0823 07:42:14.635074 32 server.cc:261] No server context available. Exiting immediately.
error: creating server: Internal - failed to stat file ./model_repository
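My understanding (an assumption on my part, not confirmed by the docs) is that Triton resolves the relative path ./model_repository against the server's working directory, so the error suggests that directory is missing or was not packaged into the bento. A minimal check I ran from the server's working directory inside the container (e.g. via `docker exec`):

```shell
# Report whether the directory Triton is trying to stat actually exists
# at the given path; mirrors the path from the error message.
check_model_repo() {
  if [ -d "$1" ]; then
    echo "found"
  else
    echo "missing"
  fi
}

check_model_repo ./model_repository
```

The NVML/CUDA warnings earlier in the log may be a separate issue (the container possibly lacking GPU access, e.g. missing `--gpus all`), but the server exits on the stat failure first.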
To reproduce
No response
Expected behavior
No response
Environment
Environment variables
BENTOML_DEBUG=''
BENTOML_QUIET=''
BENTOML_BUNDLE_LOCAL_BUILD=''
BENTOML_DO_NOT_TRACK=''
BENTOML_CONFIG=''
BENTOML_CONFIG_OPTIONS=''
BENTOML_PORT=''
BENTOML_HOST=''
BENTOML_API_WORKERS=''
System information
bentoml: 1.1.1
python: 3.9.17
platform: Linux-6.2.0-26-generic-x86_64-with-glibc2.35
uid_gid: 1000:1000
conda: 23.5.2
in_conda_env: True
conda_packages
name: bento
channels:
- defaults
dependencies:
- _libgcc_mutex=0.1=main
- _openmp_mutex=5.1=1_gnu
- ca-certificates=2023.05.30=h06a4308_0
- ld_impl_linux-64=2.38=h1181459_1
- libffi=3.4.4=h6a678d5_0
- libgcc-ng=11.2.0=h1234567_1
- libgomp=11.2.0=h1234567_1
- libstdcxx-ng=11.2.0=h1234567_1
- ncurses=6.4=h6a678d5_0
- openssl=3.0.10=h7f8727e_0
- pip=23.2.1=py39h06a4308_0
- python=3.9.17=h955ad1f_0
- readline=8.2=h5eee18b_0
- setuptools=68.0.0=py39h06a4308_0
- sqlite=3.41.2=h5eee18b_0
- tk=8.6.12=h1ccaba5_0
- wheel=0.38.4=py39h06a4308_0
- xz=5.4.2=h5eee18b_0
- zlib=1.2.13=h5eee18b_0
- pip:
- absl-py==1.4.0
- aiohttp==3.8.5
- aiosignal==1.3.1
- albumentations==1.3.1
- annotated-types==0.5.0
- anyio==3.7.1
- appdirs==1.4.4
- asgiref==3.7.2
- astunparse==1.6.3
- async-timeout==4.0.3
- attrs==23.1.0
- bentoml==1.1.1
- blinker==1.6.2
- brotli==1.0.9
- build==0.10.0
- cachetools==5.3.1
- cattrs==23.1.2
- certifi==2023.7.22
- charset-normalizer==3.2.0
- circus==0.18.0
- click==8.1.7
- click-option-group==0.5.6
- cloudpickle==2.2.1
- cmake==3.27.2
- coloredlogs==15.0.1
- configargparse==1.7
- contextlib2==21.6.0
- contourpy==1.1.0
- cycler==0.11.0
- deepmerge==1.1.0
- deprecated==1.2.14
- exceptiongroup==1.1.3
- filelock==3.12.2
- filetype==1.2.0
- flask==2.3.2
- flask-basicauth==0.2.0
- flask-cors==4.0.0
- flatbuffers==2.0.7
- fonttools==4.42.1
- frozenlist==1.4.0
- fs==2.4.16
- gast==0.4.0
- gevent==23.7.0
- geventhttpclient==2.0.2
- google-auth==2.22.0
- google-auth-oauthlib==1.0.0
- google-pasta==0.2.0
- greenlet==2.0.2
- grpcio==1.57.0
- h11==0.14.0
- h5py==3.9.0
- humanfriendly==10.0
- idna==3.4
- imageio==2.31.1
- importlib-metadata==6.0.1
- importlib-resources==6.0.1
- itsdangerous==2.1.2
- jax==0.4.14
- jinja2==3.1.2
- joblib==1.3.2
- keras==2.12.0
- kiwisolver==1.4.4
- lazy-loader==0.3
- libclang==16.0.6
- lit==16.0.6
- locust==2.16.1
- markdown==3.4.4
- markdown-it-py==3.0.0
- markupsafe==2.1.3
- matplotlib==3.7.2
- mdurl==0.1.2
- ml-dtypes==0.2.0
- mpmath==1.3.0
- msgpack==1.0.5
- mss==9.0.1
- multidict==6.0.4
- networkx==3.1
- numpy==1.23.5
- nvidia-cublas-cu11==11.10.3.66
- nvidia-cublas-cu12==12.2.4.5
- nvidia-cuda-cupti-cu11==11.7.101
- nvidia-cuda-nvrtc-cu11==11.7.99
- nvidia-cuda-nvrtc-cu12==12.2.128
- nvidia-cuda-runtime-cu11==11.7.99
- nvidia-cuda-runtime-cu12==12.2.128
- nvidia-cudnn-cu11==8.5.0.96
- nvidia-cudnn-cu12==8.9.4.25
- nvidia-cufft-cu11==10.9.0.58
- nvidia-curand-cu11==10.2.10.91
- nvidia-cusolver-cu11==11.4.0.1
- nvidia-cusparse-cu11==11.7.4.91
- nvidia-nccl-cu11==2.14.3
- nvidia-nvtx-cu11==11.7.91
- nvidia-tensorrt==99.0.0
- oauthlib==3.2.2
- onnx==1.14.0
- onnxruntime==1.15.1
- onnxruntime-gpu==1.15.1
- opencv-python==4.8.0.76
- opencv-python-headless==4.8.0.76
- opentelemetry-api==1.18.0
- opentelemetry-instrumentation==0.39b0
- opentelemetry-instrumentation-aiohttp-client==0.39b0
- opentelemetry-instrumentation-asgi==0.39b0
- opentelemetry-sdk==1.18.0
- opentelemetry-semantic-conventions==0.39b0
- opentelemetry-util-http==0.39b0
- opt-einsum==3.3.0
- packaging==23.1
- pandas==2.0.3
- pathspec==0.11.2
- pillow==10.0.0
- pip-requirements-parser==32.0.1
- pip-tools==7.3.0
- prometheus-client==0.17.1
- protobuf==3.20.3
- psutil==5.9.5
- py-cpuinfo==9.0.0
- pyasn1==0.5.0
- pyasn1-modules==0.3.0
- pydantic==2.2.1
- pydantic-core==2.6.1
- pygments==2.16.1
- pynvml==11.5.0
- pyparsing==3.0.9
- pyproject-hooks==1.0.0
- python-dateutil==2.8.2
- python-json-logger==2.0.7
- python-multipart==0.0.6
- python-rapidjson==1.10
- pytz==2023.3
- pywavelets==1.4.1
- pyyaml==6.0.1
- pyzmq==25.1.1
- qudida==0.0.4
- requests==2.31.0
- requests-oauthlib==1.3.1
- rich==13.5.2
- roundrobin==0.0.4
- rsa==4.9
- schema==0.7.5
- scikit-image==0.21.0
- scikit-learn==1.3.0
- scipy==1.11.2
- seaborn==0.12.2
- simple-di==0.1.5
- six==1.16.0
- sniffio==1.3.0
- starlette==0.28.0
- sympy==1.12
- tensorboard==2.12.3
- tensorboard-data-server==0.7.1
- tensorflow==2.12.0
- tensorflow-estimator==2.12.0
- tensorflow-io==0.33.0
- tensorflow-io-gcs-filesystem==0.33.0
- tensorrt==8.6.1
- tensorrt-bindings==8.6.1
- tensorrt-libs==8.6.1
- termcolor==2.3.0
- tf2onnx==1.15.0
- thop==0.1.1-2209072238
- threadpoolctl==3.2.0
- tifffile==2023.8.12
- tomli==2.0.1
- torch==2.0.1
- torchvision==0.15.2
- tornado==6.3.3
- tqdm==4.66.1
- triton==2.0.0
- tritonclient==2.36.0
- typing-extensions==4.7.1
- tzdata==2023.3
- ultralytics==8.0.158
- urllib3==1.26.16
- uvicorn==0.23.2
- watchfiles==0.19.0
- werkzeug==2.3.7
- wrapt==1.14.1
- yarl==1.9.2
- zipp==3.16.2
- zope-event==5.0
- zope-interface==6.0
prefix: /home/hayden/miniconda3/envs/bento