trt-llm-rag-windows
Linux
Why force Windows only?
It would definitely be pertinent to have a version compatible with Ubuntu / Linux, since there are scientific workstations running Linux.
I believe the majority of users interested in this kind of project are Linux users, and it's disheartening that it was released without an API or support for Linux. I hope they provide a Linux version soon.
This repo is all Python code. I have already run it on Ubuntu 22.04; you only need to fix a few dependencies manually.
Hi @noahc1510, can you describe the process you used to fix the dependencies?
It would be nice to add what you did to the readme so people can use it on Linux.
@noahc1510 seconded. I would consider automating it and making a PR to add it. Basically throwing it against the wall, seeing if it sticks.
You can follow the guide below to fix the requirements. I tested it with the official installer and created a repo for it: https://github.com/noahc1510/trt-llm-rag-linux
System Requirements
- Nvidia Driver:
sudo apt install nvidia-driver-535
- CUDA:
sudo apt install nvidia-cuda-toolkit
- NCCL:
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/cuda-keyring_1.0-1_all.deb
sudo dpkg -i cuda-keyring_1.0-1_all.deb
sudo apt-get update
sudo apt install libnccl2
- libmpi:
sudo apt install libopenmpi-dev
Installation
- Install Miniconda, create a new environment, and install pytorch=2.1.0, mpi4py=3.1.5, and tensorrt-llm:
conda create -n trtllm python=3.10
conda activate trtllm
conda install pytorch torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia
conda install -c conda-forge mpi4py mpich
pip install --no-cache-dir --extra-index-url https://pypi.nvidia.com tensorrt-llm
In China, you can use the commands below without a VPN:
conda create -n trtllm python=3.10
conda activate trtllm
conda install pytorch torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia
conda install -c conda-forge mpi4py mpich
pip install -i https://pypi.tuna.tsinghua.edu.cn/simple --extra-index-url https://pypi.nvidia.com tensorrt-llm
- Install the requirements
pip install -r requirements.txt
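Not part of the original guide, but before launching the app I'd suggest a quick sanity check; a minimal sketch, assuming the conda environment from above is active:
# Sanity check (my own suggestion, not from noahc1510's guide):
# confirm the heavy dependencies import cleanly and PyTorch sees the GPU.
import torch
import tensorrt_llm

print("torch:", torch.__version__, "| CUDA available:", torch.cuda.is_available())
print("tensorrt_llm:", tensorrt_llm.__version__)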
There are two points in the readme that border on insulting.
First, it's aimed solely at Windows, when the vast majority of servers used to train ML models run Linux. That's a bit much for a community that has been fighting for 30 years to explain why we use this OS.
It's also getting very irritating to see the perpetual conflation of Ubuntu, which is one distribution (and not the best at that), with GNU/Linux.
It's time for a change of mentality once and for all. So, thank you for making sure that these models and this kind of project are offered on GNU/Linux in general from the start.
Sorry for the irritation, but I've been in this business for 22 years and it's starting to get a bit disheartening.
Hi @noahc1510. I have Ubuntu 23.10 running on an RTX 4090. I followed your instructions, and after pip install -r requirements.txt I downloaded the engine and tokenizer files.
Now I'm getting a different error:
(trtllm) vgtu@vgtu-Default-string:~/Tomo/trt-llm-rag-windows$ python app.py --trt_engine_path model/ --trt_engine_name llama_float16_tp1_rank0.engine --tokenizer_dir_path model/ --data_dir dataset/
Traceback (most recent call last):
  File "/home/vgtu/Tomo/trt-llm-rag-windows/app.py", line 26, in <module>
    from trt_llama_api import TrtLlmAPI  # llama_index does not currently support TRT-LLM. The trt_llama_api.py file defines a llama_index compatible interface for TRT-LLM.
  File "/home/vgtu/Tomo/trt-llm-rag-windows/trt_llama_api.py", line 24, in <module>
    from llama_index.bridge.pydantic import Field, PrivateAttr
ModuleNotFoundError: No module named 'llama_index.bridge'
At first only llama_index was missing; I installed it with pip install llama_index, and now llama_index.bridge is missing... I also tried conda install llama_index; it installed, but the same error keeps showing.
Can you please help?
Thank you very much
Yeah... I just found out that I had to change llama_index.bridge to llama_index.legacy.bridge, and the same for some other imports.
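For anyone else hitting this, the change in trt_llama_api.py looks roughly like the following; a sketch assuming a llama_index 0.10+ release, which moved the old top-level modules under llama_index.legacy:
# Old import at trt_llama_api.py line 24 — fails on newer llama_index releases:
# from llama_index.bridge.pydantic import Field, PrivateAttr

# New path: the old top-level modules now live under llama_index.legacy.
from llama_index.legacy.bridge.pydantic import Field, PrivateAttr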
The current application only targets Windows.
https://chat-withrtx.com/linux/
Anyway, I now decided to purge everything I can that is related to Nvidia.
I now use llama.cpp (and the llama-cpp-python bindings), compiled with the Vulkan backend instead of CUDA.
I can chat with documents using llama-index or my own ChromaDB + prompt. I can use whatever model I want, and it can split a model across several GPUs plus the CPU if needed. It's easy to use, easy to develop with, and I have the control.
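For context, the ChromaDB + prompt route is only a few lines with llama-cpp-python; a minimal sketch, where the model path, document text, and collection name are placeholders of mine, not from this thread:
# Minimal RAG loop: retrieve the closest document from ChromaDB,
# then stuff it into the prompt for a local GGUF model.
from llama_cpp import Llama
import chromadb

client = chromadb.Client()  # in-memory instance
collection = client.create_collection("docs")
collection.add(documents=["This repo currently targets Windows only."], ids=["doc1"])

llm = Llama(model_path="model.gguf")  # any GGUF model file; hypothetical path

question = "Which OS does the repo target?"
hits = collection.query(query_texts=[question], n_results=1)
context = hits["documents"][0][0]

out = llm(f"Context: {context}\n\nQuestion: {question}\nAnswer:", max_tokens=64)
print(out["choices"][0]["text"])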
Everything is open source and targeted at Linux.
I'm very disappointed by how Nvidia fails to thank the Linux community. The driver is still a problem, CUDA is closed source, and I'm tired of this direction.
Sorry guys, maybe one day you will follow our philosophy.