TensorRT
🐛 [Bug] failed to install torch-tensorrt
Bug Description
Error Message:
09T18:21:42.631Z INFO: pip is looking at multiple versions of torch-tensorrt to determine which version is compatible with other requirements. This could take a while.
2024-05-09T18:21:42.882Z Collecting torch-tensorrt (from -r /opt/ml/model/code/requirements.txt (line 6))
  Using cached torch_tensorrt-1.3.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (13 kB)
  Using cached torch-tensorrt-0.0.0.post1.tar.gz (9.0 kB)
  Preparing metadata (setup.py): started
  Preparing metadata (setup.py): finished with status 'error'
  error: subprocess-exited-with-error
  × python setup.py egg_info did not run successfully.
  │ exit code: 1
  ╰─> [13 lines of output]
      Traceback (most recent call last):
        File "
Logs:
2024-05-09T17:52:56.655Z Sagemaker TS environment variables have been set and will be used for single model endpoint.
2024-05-09T17:52:56.655Z Collecting sagemaker-inference==1.10.1 (from -r /opt/ml/model/code/requirements.txt (line 1)) Downloading sagemaker_inference-1.10.1.tar.gz (23 kB) Preparing metadata (setup.py): started Preparing metadata (setup.py): finished with status 'done'
2024-05-09T17:52:56.808Z Collecting setfit==1.0.1 (from -r /opt/ml/model/code/requirements.txt (line 2)) Downloading setfit-1.0.1-py3-none-any.whl.metadata (11 kB)
2024-05-09T17:52:56.808Z Collecting transformers==4.37.2 (from -r /opt/ml/model/code/requirements.txt (line 3)) Downloading transformers-4.37.2-py3-none-any.whl.metadata (129 kB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 129.4/129.4 kB 9.3 MB/s eta 0:00:00
2024-05-09T17:52:56.808Z Requirement already satisfied: torch==2.1.0 in /opt/conda/lib/python3.10/site-packages (from -r /opt/ml/model/code/requirements.txt (line 4)) (2.1.0+cu118)
2024-05-09T17:52:57.059Z Collecting optimum (from -r /opt/ml/model/code/requirements.txt (line 5)) Downloading optimum-1.19.2-py3-none-any.whl.metadata (19 kB)
2024-05-09T17:52:57.059Z Collecting torch-tensorrt (from -r /opt/ml/model/code/requirements.txt (line 6)) Downloading torch_tensorrt-1.4.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (13 kB)
2024-05-09T17:52:57.059Z Requirement already satisfied: boto3 in /opt/conda/lib/python3.10/site-packages (from sagemaker-inference==1.10.1->-r /opt/ml/model/code/requirements.txt (line 1)) (1.28.60)
2024-05-09T17:52:57.059Z Requirement already satisfied: numpy in /opt/conda/lib/python3.10/site-packages (from sagemaker-inference==1.10.1->-r /opt/ml/model/code/requirements.txt (line 1)) (1.24.4)
2024-05-09T17:52:57.059Z Requirement already satisfied: six in /opt/conda/lib/python3.10/site-packages (from sagemaker-inference==1.10.1->-r /opt/ml/model/code/requirements.txt (line 1)) (1.16.0)
2024-05-09T17:52:57.059Z Requirement already satisfied: psutil in /opt/conda/lib/python3.10/site-packages (from sagemaker-inference==1.10.1->-r /opt/ml/model/code/requirements.txt (line 1)) (5.9.5)
2024-05-09T17:52:57.059Z Requirement already satisfied: retrying<1.4,>=1.3.3 in /opt/conda/lib/python3.10/site-packages (from sagemaker-inference==1.10.1->-r /opt/ml/model/code/requirements.txt (line 1)) (1.3.4)
2024-05-09T17:52:57.059Z Requirement already satisfied: scipy in /opt/conda/lib/python3.10/site-packages (from sagemaker-inference==1.10.1->-r /opt/ml/model/code/requirements.txt (line 1)) (1.10.1)
2024-05-09T17:52:57.059Z Collecting datasets>=2.3.0 (from setfit==1.0.1->-r /opt/ml/model/code/requirements.txt (line 2)) Downloading datasets-2.19.1-py3-none-any.whl.metadata (19 kB)
2024-05-09T17:52:57.059Z Collecting sentence-transformers>=2.2.1 (from setfit==1.0.1->-r /opt/ml/model/code/requirements.txt (line 2)) Downloading sentence_transformers-2.7.0-py3-none-any.whl.metadata (11 kB)
2024-05-09T17:52:57.310Z Collecting evaluate>=0.3.0 (from setfit==1.0.1->-r /opt/ml/model/code/requirements.txt (line 2)) Downloading evaluate-0.4.2-py3-none-any.whl.metadata (9.3 kB)
2024-05-09T17:52:57.310Z Collecting huggingface-hub>=0.13.0 (from setfit==1.0.1->-r /opt/ml/model/code/requirements.txt (line 2)) Downloading huggingface_hub-0.23.0-py3-none-any.whl.metadata (12 kB)
2024-05-09T17:52:57.560Z Requirement already satisfied: scikit-learn in /opt/conda/lib/python3.10/site-packages (from setfit==1.0.1->-r /opt/ml/model/code/requirements.txt (line 2)) (1.1.3)
2024-05-09T17:52:57.560Z Requirement already satisfied: filelock in /opt/conda/lib/python3.10/site-packages (from transformers==4.37.2->-r /opt/ml/model/code/requirements.txt (line 3)) (3.13.1)
2024-05-09T17:52:57.560Z Requirement already satisfied: packaging>=20.0 in /opt/conda/lib/python3.10/site-packages (from transformers==4.37.2->-r /opt/ml/model/code/requirements.txt (line 3)) (23.1)
2024-05-09T17:52:58.061Z Requirement already satisfied: pyyaml>=5.1 in /opt/conda/lib/python3.10/site-packages (from transformers==4.37.2->-r /opt/ml/model/code/requirements.txt (line 3)) (6.0)
2024-05-09T17:52:58.061Z Collecting regex!=2019.12.17 (from transformers==4.37.2->-r /opt/ml/model/code/requirements.txt (line 3)) Downloading regex-2024.4.28-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (40 kB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 40.8/40.8 kB 17.1 MB/s eta 0:00:00
2024-05-09T17:52:58.311Z Requirement already satisfied: requests in /opt/conda/lib/python3.10/site-packages (from transformers==4.37.2->-r /opt/ml/model/code/requirements.txt (line 3)) (2.31.0)
2024-05-09T17:52:58.562Z Collecting tokenizers<0.19,>=0.14 (from transformers==4.37.2->-r /opt/ml/model/code/requirements.txt (line 3)) Downloading tokenizers-0.15.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (6.7 kB)
2024-05-09T17:52:58.562Z Collecting safetensors>=0.4.1 (from transformers==4.37.2->-r /opt/ml/model/code/requirements.txt (line 3)) Downloading safetensors-0.4.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (3.8 kB)
2024-05-09T17:52:58.562Z Requirement already satisfied: tqdm>=4.27 in /opt/conda/lib/python3.10/site-packages (from transformers==4.37.2->-r /opt/ml/model/code/requirements.txt (line 3)) (4.66.4)
2024-05-09T17:52:58.562Z Requirement already satisfied: typing-extensions in /opt/conda/lib/python3.10/site-packages (from torch==2.1.0->-r /opt/ml/model/code/requirements.txt (line 4)) (4.9.0)
2024-05-09T17:52:58.563Z Requirement already satisfied: sympy in /opt/conda/lib/python3.10/site-packages (from torch==2.1.0->-r /opt/ml/model/code/requirements.txt (line 4)) (1.12)
2024-05-09T17:52:58.563Z Requirement already satisfied: networkx in /opt/conda/lib/python3.10/site-packages (from torch==2.1.0->-r /opt/ml/model/code/requirements.txt (line 4)) (3.2.1)
2024-05-09T17:52:58.563Z Requirement already satisfied: jinja2 in /opt/conda/lib/python3.10/site-packages (from torch==2.1.0->-r /opt/ml/model/code/requirements.txt (line 4)) (3.1.4)
2024-05-09T17:52:58.563Z Requirement already satisfied: fsspec in /opt/conda/lib/python3.10/site-packages (from torch==2.1.0->-r /opt/ml/model/code/requirements.txt (line 4)) (2023.12.2)
2024-05-09T17:52:58.814Z Collecting coloredlogs (from optimum->-r /opt/ml/model/code/requirements.txt (line 5)) Downloading coloredlogs-15.0.1-py2.py3-none-any.whl.metadata (12 kB)
2024-05-09T17:52:58.814Z INFO: pip is looking at multiple versions of torch-tensorrt to determine which version is compatible with other requirements. This could take a while.
2024-05-09T17:52:59.065Z Collecting torch-tensorrt (from -r /opt/ml/model/code/requirements.txt (line 6))
  Downloading torch_tensorrt-1.3.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (13 kB)
  Downloading torch-tensorrt-0.0.0.post1.tar.gz (9.0 kB)
  Preparing metadata (setup.py): started
  Preparing metadata (setup.py): finished with status 'error'
  error: subprocess-exited-with-error
  × python setup.py egg_info did not run successfully.
  │ exit code: 1
  ╰─> [13 lines of output]
      Traceback (most recent call last):
        File "<string>", line 2, in <module>
        File "<pip-setuptools-caller>", line 34, in <module>
        File "/home/model-server/tmp/pip-install-ndpb_izf/torch-tensorrt_1eaee9fc2794472ca9b57c4ba02da88f/setup.py", line 125, in <module>
          raise RuntimeError(open("ERROR.txt", "r").read())
      RuntimeError:
      ###########################################################################################
      The package you are trying to install is only a placeholder project on PyPI.org repository.
      To install Torch-TensorRT please run the following command:
      $ pip install torch-tensorrt -f https://github.com/NVIDIA/Torch-TensorRT/releases
      ###########################################################################################
      [end of output]
  note: This error originates from a subprocess, and is likely not a problem with pip.
2024-05-09T17:52:59.065Z error: metadata-generation-failed
2024-05-09T17:52:59.065Z × Encountered error while generating package metadata.
2024-05-09T17:52:59.065Z ╰─> See above for output.
2024-05-09T17:52:59.065Z note: This is an issue with the package mentioned above, not pip.
2024-05-09T17:52:59.316Z hint: See above for details.
2024-05-09T17:52:59.316Z 2024-05-09 17:52:59,107 - sagemaker-inference - ERROR - failed to install required packages, exiting
2024-05-09T17:52:59.316Z Traceback (most recent call last):
  File "/opt/conda/lib/python3.10/site-packages/sagemaker_inference/model_server.py", line 41, in _install_requirements
    subprocess.check_call(pip_install_cmd)
  File "/opt/conda/lib/python3.10/subprocess.py", line 369, in check_call
    raise CalledProcessError(retcode, cmd)
2024-05-09T17:52:59.316Z subprocess.CalledProcessError: Command '['/opt/conda/bin/python', '-m', 'pip', 'install', '-r', '/opt/ml/model/code/requirements.txt']' returned non-zero exit status 1.
2024-05-09T17:52:59.316Z During handling of the above exception, another exception occurred:
2024-05-09T17:52:59.316Z Traceback (most recent call last):
  File "/usr/local/bin/dockerd-entrypoint.py", line 23, in <module>
    serving.main()
  File "/opt/conda/lib/python3.10/site-packages/sagemaker_pytorch_serving_container/serving.py", line 38, in main
    _start_torchserve()
  File "/opt/conda/lib/python3.10/site-packages/retrying.py", line 56, in wrapped_f
    return Retrying(*dargs, **dkw).call(f, *args, **kw)
  File "/opt/conda/lib/python3.10/site-packages/retrying.py", line 257, in call
    return attempt.get(self._wrap_exception)
  File "/opt/conda/lib/python3.10/site-packages/retrying.py", line 301, in get
    six.reraise(self.value[0], self.value[1], self.value[2])
  File "/opt/conda/lib/python3.10/site-packages/six.py", line 719, in reraise
    raise value
  File "/opt/conda/lib/python3.10/site-packages/retrying.py", line 251, in call
    attempt = Attempt(fn(*args, **kwargs), attempt_number, False)
  File "/opt/conda/lib/python3.10/site-packages/sagemaker_pytorch_serving_container/serving.py", line 34, in _start_torchserve
    torchserve.start_torchserve(handler_service=HANDLER_SERVICE)
  File "/opt/conda/lib/python3.10/site-packages/sagemaker_pytorch_serving_container/torchserve.py", line 79, in start_torchserve
    model_server._install_requirements()
  File "/opt/conda/lib/python3.10/site-packages/sagemaker_inference/model_server.py", line 44, in _install_requirements
    raise ValueError("failed to install required packages")
2024-05-09T17:53:01.977Z ValueError: failed to install required packages
2024-05-09T17:53:02.072Z Sagemaker TS environment variables have been set and will be used for single model endpoint.
2024-05-09T17:53:02.573Z Collecting sagemaker-inference==1.10.1 (from -r /opt/ml/model/code/requirements.txt (line 1)) Using cached sagemaker_inference-1.10.1.tar.gz (23 kB) Preparing metadata (setup.py): started Preparing metadata (setup.py): finished with status 'done'
2024-05-09T17:53:02.573Z Collecting setfit==1.0.1 (from -r /opt/ml/model/code/requirements.txt (line 2)) Using cached setfit-1.0.1-py3-none-any.whl.metadata (11 kB)
2024-05-09T17:53:02.573Z Collecting transformers==4.37.2 (from -r /opt/ml/model/code/requirements.txt (line 3)) Using cached transformers-4.37.2-py3-none-any.whl.metadata (129 kB)
2024-05-09T17:53:02.573Z Requirement already satisfied: torch==2.1.0 in /opt/conda/lib/python3.10/site-packages (from -r /opt/ml/model/code/requirements.txt (line 4)) (2.1.0+cu118)
2024-05-09T17:53:02.573Z Collecting optimum (from -r /opt/ml/model/code/requirements.txt (line 5)) Using cached optimum-1.19.2-py3-none-any.whl.metadata (19 kB)
2024-05-09T17:53:02.573Z Collecting torch-tensorrt (from -r /opt/ml/model/code/requirements.txt (line 6)) Using cached torch_tensorrt-1.4.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (13 kB)
2024-05-09T17:53:02.573Z Requirement already satisfied: boto3 in /opt/conda/lib/python3.10/site-packages (from sagemaker-inference==1.10.1->-r /opt/ml/model/code/requirements.txt (line 1)) (1.28.60)
2024-05-09T17:53:02.573Z Requirement already satisfied: numpy in /opt/conda/lib/python3.10/site-packages (from sagemaker-inference==1.10.1->-r /opt/ml/model/code/requirements.txt (line 1)) (1.24.4)
2024-05-09T17:53:02.573Z Requirement already satisfied: six in /opt/conda/lib/python3.10/site-packages (from sagemaker-inference==1.10.1->-r /opt/ml/model/code/requirements.txt (line 1)) (1.16.0)
2024-05-09T17:53:02.573Z Requirement already satisfied: psutil in /opt/conda/lib/python3.10/site-packages (from sagemaker-inference==1.10.1->-r /opt/ml/model/code/requirements.txt (line 1)) (5.9.5)
2024-05-09T17:53:02.573Z Requirement already satisfied: retrying<1.4,>=1.3.3 in /opt/conda/lib/python3.10/site-packages (from sagemaker-inference==1.10.1->-r /opt/ml/model/code/requirements.txt (line 1)) (1.3.4)
2024-05-09T17:53:02.824Z Requirement already satisfied: scipy in /opt/conda/lib/python3.10/site-packages (from sagemaker-inference==1.10.1->-r /opt/ml/model/code/requirements.txt (line 1)) (1.10.1)
2024-05-09T17:53:02.824Z Collecting datasets>=2.3.0 (from setfit==1.0.1->-r /opt/ml/model/code/requirements.txt (line 2)) Using cached datasets-2.19.1-py3-none-any.whl.metadata (19 kB)
2024-05-09T17:53:02.824Z Collecting sentence-transformers>=2.2.1 (from setfit==1.0.1->-r /opt/ml/model/code/requirements.txt (line 2)) Using cached sentence_transformers-2.7.0-py3-none-any.whl.metadata (11 kB)
2024-05-09T17:53:02.824Z Collecting evaluate>=0.3.0 (from setfit==1.0.1->-r /opt/ml/model/code/requirements.txt (line 2)) Using cached evaluate-0.4.2-py3-none-any.whl.metadata (9.3 kB)
2024-05-09T17:53:02.824Z Collecting huggingface-hub>=0.13.0 (from setfit==1.0.1->-r /opt/ml/model/code/requirements.txt (line 2)) Using cached huggingface_hub-0.23.0-py3-none-any.whl.metadata (12 kB)
2024-05-09T17:53:03.326Z Requirement already satisfied: scikit-learn in /opt/conda/lib/python3.10/site-packages (from setfit==1.0.1->-r /opt/ml/model/code/requirements.txt (line 2)) (1.1.3)
2024-05-09T17:53:03.326Z Requirement already satisfied: filelock in /opt/conda/lib/python3.10/site-packages (from transformers==4.37.2->-r /opt/ml/model/code/requirements.txt (line 3)) (3.13.1)
2024-05-09T17:53:03.326Z Requirement already satisfied: packaging>=20.0 in /opt/conda/lib/python3.10/site-packages (from transformers==4.37.2->-r /opt/ml/model/code/requirements.txt (line 3)) (23.1)
2024-05-09T17:53:03.576Z Requirement already satisfied: pyyaml>=5.1 in /opt/conda/lib/python3.10/site-packages (from transformers==4.37.2->-r /opt/ml/model/code/requirements.txt (line 3)) (6.0)
2024-05-09T17:53:03.576Z Collecting regex!=2019.12.17 (from transformers==4.37.2->-r /opt/ml/model/code/requirements.txt (line 3)) Using cached regex-2024.4.28-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (40 kB)
2024-05-09T17:53:03.826Z Requirement already satisfied: requests in /opt/conda/lib/python3.10/site-packages (from transformers==4.37.2->-r /opt/ml/model/code/requirements.txt (line 3)) (2.31.0)
2024-05-09T17:53:04.077Z Collecting tokenizers<0.19,>=0.14 (from transformers==4.37.2->-r /opt/ml/model/code/requirements.txt (line 3)) Using cached tokenizers-0.15.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (6.7 kB)
2024-05-09T17:53:04.077Z Collecting safetensors>=0.4.1 (from transformers==4.37.2->-r /opt/ml/model/code/requirements.txt (line 3)) Using cached safetensors-0.4.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (3.8 kB)
2024-05-09T17:53:04.077Z Requirement already satisfied: tqdm>=4.27 in /opt/conda/lib/python3.10/site-packages (from transformers==4.37.2->-r /opt/ml/model/code/requirements.txt (line 3)) (4.66.4)
2024-05-09T17:53:04.077Z Requirement already satisfied: typing-extensions in /opt/conda/lib/python3.10/site-packages (from torch==2.1.0->-r /opt/ml/model/code/requirements.txt (line 4)) (4.9.0)
2024-05-09T17:53:04.077Z Requirement already satisfied: sympy in /opt/conda/lib/python3.10/site-packages (from torch==2.1.0->-r /opt/ml/model/code/requirements.txt (line 4)) (1.12)
2024-05-09T17:53:04.077Z Requirement already satisfied: networkx in /opt/conda/lib/python3.10/site-packages (from torch==2.1.0->-r /opt/ml/model/code/requirements.txt (line 4)) (3.2.1)
2024-05-09T17:53:04.077Z Requirement already satisfied: jinja2 in /opt/conda/lib/python3.10/site-packages (from torch==2.1.0->-r /opt/ml/model/code/requirements.txt (line 4)) (3.1.4)
2024-05-09T17:53:04.077Z Requirement already satisfied: fsspec in /opt/conda/lib/python3.10/site-packages (from torch==2.1.0->-r /opt/ml/model/code/requirements.txt (line 4)) (2023.12.2)
2024-05-09T17:53:04.328Z Collecting coloredlogs (from optimum->-r /opt/ml/model/code/requirements.txt (line 5)) Using cached coloredlogs-15.0.1-py2.py3-none-any.whl.metadata (12 kB)
2024-05-09T17:53:04.328Z INFO: pip is looking at multiple versions of torch-tensorrt to determine which version is compatible with other requirements. This could take a while.
2024-05-09T17:53:04.578Z Collecting torch-tensorrt (from -r /opt/ml/model/code/requirements.txt (line 6))
  Using cached torch_tensorrt-1.3.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (13 kB)
  Using cached torch-tensorrt-0.0.0.post1.tar.gz (9.0 kB)
  Preparing metadata (setup.py): started
  Preparing metadata (setup.py): finished with status 'error'
  error: subprocess-exited-with-error
  × python setup.py egg_info did not run successfully.
  │ exit code: 1
  ╰─> [13 lines of output]
      Traceback (most recent call last):
        File "<string>", line 2, in <module>
        File "<pip-setuptools-caller>", line 34, in <module>
        File "/home/model-server/tmp/pip-install-ou8dudye/torch-tensorrt_f613c1ea02ee46eba6289ad76ccd02c4/setup.py", line 125, in <module>
          raise RuntimeError(open("ERROR.txt", "r").read())
      RuntimeError:
      ###########################################################################################
      The package you are trying to install is only a placeholder project on PyPI.org repository.
      To install Torch-TensorRT please run the following command:
      $ pip install torch-tensorrt -f https://github.com/NVIDIA/Torch-TensorRT/releases
      ###########################################################################################
      [end of output]
  note: This error originates from a subprocess, and is likely not a problem with pip.
2024-05-09T17:53:04.578Z error: metadata-generation-failed
2024-05-09T17:53:04.578Z × Encountered error while generating package metadata.
2024-05-09T17:53:04.578Z ╰─> See above for output.
2024-05-09T17:53:04.578Z note: This is an issue with the package mentioned above, not pip.
2024-05-09T17:53:04.578Z hint: See above for details.
2024-05-09T17:53:04.578Z 2024-05-09 17:53:04,566 - sagemaker-inference - ERROR - failed to install required packages, exiting
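In logs this long the actual cause is easy to miss: the failure comes from the torch-tensorrt-0.0.0.post1 placeholder sdist, whose setup.py raises an error containing a suggested install command. The following is a minimal sketch for pulling that hint out of a captured log. The function name `find_placeholder_hint` and the regular expression are illustrative assumptions, not part of pip or sagemaker-inference.

```python
import re
from typing import Optional

# Phrase printed by the placeholder package's setup.py (taken from the log above).
PLACEHOLDER_MARKER = "only a placeholder project on PyPI.org"


def find_placeholder_hint(log_text: str) -> Optional[str]:
    """Return the install command suggested by a placeholder package, if any.

    Hypothetical helper: it only recognises the message format seen in this log.
    """
    if PLACEHOLDER_MARKER not in log_text:
        return None
    # The suggested command follows a "$ " prompt in the error text.
    match = re.search(r"\$ (pip install \S+ -f \S+)", log_text)
    return match.group(1) if match else None


if __name__ == "__main__":
    sample = (
        "The package you are trying to install is only a placeholder project "
        "on PyPI.org repository. To install Torch-TensorRT please run the "
        "following command: $ pip install torch-tensorrt -f "
        "https://github.com/NVIDIA/Torch-TensorRT/releases ####"
    )
    print(find_placeholder_hint(sample))
```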
To Reproduce
Steps to reproduce the behavior:
- Deploy a SageMaker container that installs the following requirements.txt at startup
requirements.txt:
sagemaker-inference==1.10.1
setfit==1.0.1
transformers==4.37.2
torch==2.1.0
optimum
torch-tensorrt
sagemaker image: 763104351884.dkr.ecr.us-west-2.amazonaws.com/pytorch-inference:2.1-gpu-py310
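In the requirements.txt above, optimum and torch-tensorrt carry no version specifier. In the log, pip moves from torch_tensorrt-1.4.0 to 1.3.0 and then to the 0.0.0.post1 placeholder, which suggests (the log alone does not prove it) that the resolver was backtracking to find a release compatible with torch==2.1.0. A minimal sketch for flagging unpinned entries as candidates for an explicit pin follows. The helper name `unpinned_requirements` is hypothetical, and the parsing is deliberately simplified and does not cover the full requirements-file syntax.

```python
import re


def unpinned_requirements(text):
    """Return names of requirements that carry no version specifier.

    Simplified sketch: handles plain "name" and "name[extra]" lines only.
    """
    names = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        # A bare project name (optionally with extras) has no ==, >=, etc.
        if re.fullmatch(r"[A-Za-z0-9._-]+(\[[^\]]*\])?", line):
            names.append(line)
    return names


if __name__ == "__main__":
    requirements = """\
sagemaker-inference==1.10.1
setfit==1.0.1
transformers==4.37.2
torch==2.1.0
optimum
torch-tensorrt
"""
    print(unpinned_requirements(requirements))
```

Which torch-tensorrt release matches torch 2.1.0 is not stated anywhere in this report, so the sketch only reports the unpinned names and leaves the choice of version to the reader.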
Expected behavior
The requirements install without error.
Environment
Build information about Torch-TensorRT can be found by turning on debug messages
- aws instance: g4dn.xlarge
- docker image is here: https://github.com/aws/deep-learning-containers/blob/master/pytorch/inference/docker/2.1/py3/cu118/Dockerfile.gpu
- Torch-TensorRT Version (e.g. 1.0.0):
- PyTorch Version (e.g. 1.0):
- CPU Architecture:
- OS (e.g., Linux):
- How you installed PyTorch (conda, pip, libtorch, source):
- Build command you used (if compiling from source):
- Are you using local sources or building from archives:
- Python version:
- CUDA version:
- GPU models and configuration: Nvidia T4
- Any other relevant information:
Additional context
I also tried CUDA 11.8 with NVIDIA driver 470.182.03, PyTorch 2.2.0, and torch-tensorrt 2.2.0, and got the same error. Is that expected?
Where did you install torch-tensorrt from? Did you install the CUDA 11.8 version (available on http://download.pytorch.org/whl/cu118) or the CUDA 12 version (the one available on PyPI)?
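One way to answer the CUDA question is to read the local version tag of the installed PyTorch build. The log above reports torch 2.1.0+cu118, and `torch.__version__` returns a string in that form. The sketch below derives the tag and the matching wheel index URL by plain string handling, so it runs without torch installed. The helper names `cuda_tag` and `wheel_index_url` are hypothetical, and the https base URL is an assumption modelled on the link in the comment above.

```python
import re


def cuda_tag(torch_version):
    """Extract the local CUDA tag (e.g. 'cu118') from a torch version string."""
    match = re.search(r"\+(cu\d+)$", torch_version)
    return match.group(1) if match else None


def wheel_index_url(torch_version, base="https://download.pytorch.org/whl"):
    """Build the per-CUDA wheel index URL, or None if the build has no CUDA tag."""
    tag = cuda_tag(torch_version)
    return f"{base}/{tag}" if tag else None


if __name__ == "__main__":
    # Version string as reported in the pip log for the SageMaker image.
    print(cuda_tag("2.1.0+cu118"))
    print(wheel_index_url("2.1.0+cu118"))
```

The resulting URL could then be supplied to pip, for example through its `--extra-index-url` option, so that the CUDA 11.8 build is considered rather than only what PyPI offers.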