trt-llm-rag-windows

Linux

nmandic78 opened this issue Feb 13 '24 · 9 comments

Why force Windows only?

nmandic78 avatar Feb 13 '24 17:02 nmandic78

It would definitely be pertinent to have a version compatible with Ubuntu / Linux, since there are scientific workstations running Linux.

flefevre avatar Feb 13 '24 20:02 flefevre

I believe that the majority of users interested in a project like this are Linux users, and it's indeed disheartening that it was released without an API or Linux support. I hope they provide a Linux version soon.

IfrKonv avatar Feb 15 '24 11:02 IfrKonv

This repo is all Python code. I have already run it on Ubuntu 22.04; you only need to fix a few dependencies manually.

noahc1510 avatar Feb 18 '24 19:02 noahc1510

Hi @noahc1510, can you describe the process you used to fix the dependencies?

Would be nice to add what you did to the readme so people can use it on Linux.

cdelv avatar Feb 20 '24 23:02 cdelv

@noahc1510 seconded. I would consider automating it and making a PR to add it. Basically throwing it against the wall and seeing if it sticks.

REALERvolker1 avatar Feb 21 '24 07:02 REALERvolker1

> Hi @noahc1510, can you describe the process you used to fix the dependencies?
>
> Would be nice to add what you did to the readme so people can use it on Linux.

You can follow the guide below to fix the requirements. I tested it against the official installer and created a repo for it: https://github.com/noahc1510/trt-llm-rag-linux

System Requirements

  • Nvidia Driver: sudo apt install nvidia-driver-535
  • CUDA: sudo apt install nvidia-cuda-toolkit
  • NCCL:
    wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/cuda-keyring_1.0-1_all.deb
    sudo dpkg -i cuda-keyring_1.0-1_all.deb
    sudo apt-get update
    sudo apt install libnccl2
    
  • libmpi: sudo apt install libopenmpi-dev
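
Not part of the original guide, but a quick sanity check that these system packages installed cleanly (assuming a standard Ubuntu setup) might look like this:

    # driver and CUDA toolkit should be visible
    nvidia-smi
    nvcc --version
    # NCCL and Open MPI libraries should be present
    ldconfig -p | grep libnccl
    mpirun --version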

Installation

  1. Install Miniconda, create a new environment, and install pytorch=2.1.0, mpi4py=3.1.5 and tensorrt-llm
    conda create -n trtllm python=3.10
    conda activate trtllm
    conda install pytorch torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia
    conda install -c conda-forge mpi4py mpich
    pip install --no-cache-dir --extra-index-url https://pypi.nvidia.com tensorrt-llm
    
    In China, you can use the commands below without a VPN:
    conda create -n trtllm python=3.10
    conda activate trtllm
    conda install pytorch torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia
    conda install -c conda-forge mpi4py mpich
    pip install -i https://pypi.tuna.tsinghua.edu.cn/simple --extra-index-url https://pypi.nvidia.com tensorrt-llm
    
  2. Install the requirements
    pip install -r requirements.txt
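
    Not from the guide itself, but as a hedged sanity check that the environment came up correctly, imports like these should succeed afterwards:
    # PyTorch should see the GPU, and TensorRT-LLM and mpi4py should import cleanly
    python -c "import torch; print(torch.cuda.is_available())"
    python -c "import tensorrt_llm; print(tensorrt_llm.__version__)"
    python -c "from mpi4py import MPI; print(MPI.Get_library_version())"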
    

noahc1510 avatar Feb 27 '24 04:02 noahc1510

There are two points in the readme that border on insulting.

Firstly, it's aimed solely at Windows, when the vast majority of servers used to train ML models are on Linux. That's a bit much for a community that's been fighting for 30 years to explain why we use this OS.

And it's also getting very irritating to see Ubuntu, which is just one distribution (and not the best one at all), perpetually conflated with GNU/Linux.

It's time for a change of mentality once and for all. So please make sure that these models and this kind of project are offered on GNU/Linux in general from the start.

Sorry for the irritation, but I've been in the business for 22 years and it's starting to get a bit desperate.

metal3d avatar Mar 17 '24 09:03 metal3d

Hi @noahc1510. I have Ubuntu 23.10 running on an RTX 4090. I followed your instructions, and after pip install -r requirements.txt I downloaded the engine and tokenizer files. Now I'm getting a different error:

    (trtllm) vgtu@vgtu-Default-string:~/Tomo/trt-llm-rag-windows$ python app.py --trt_engine_path model/ --trt_engine_name llama_float16_tp1_rank0.engine --tokenizer_dir_path model/ --data_dir dataset/
    Traceback (most recent call last):
      File "/home/vgtu/Tomo/trt-llm-rag-windows/app.py", line 26, in <module>
        from trt_llama_api import TrtLlmAPI  # llama_index does not currently support TRT-LLM. The trt_llama_api.py file defines a llama_index compatible interface for TRT-LLM.
      File "/home/vgtu/Tomo/trt-llm-rag-windows/trt_llama_api.py", line 24, in <module>
        from llama_index.bridge.pydantic import Field, PrivateAttr
    ModuleNotFoundError: No module named 'llama_index.bridge'

At first it was only missing llama_index, which I installed with pip install llama_index, and now llama_index.bridge is missing... I also tried conda install llama_index; it installed, but the same error keeps showing. Can you please help? Thank you very much

ninono12345 avatar Apr 18 '24 12:04 ninono12345

Yeah... I just found out that I had to change llama_index.bridge to llama_index.legacy.bridge, and the same for some other imports.

ninono12345 avatar Apr 20 '24 13:04 ninono12345
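
Editorial note, not from the thread: assuming the installed llama_index release has moved the old modules under llama_index.legacy, one way to apply that fix across the whole repo is a bulk search-and-replace (commit or back up first):

    # rewrite old-style llama_index imports to the legacy namespace in all Python files
    grep -rl "llama_index\.bridge" --include="*.py" . \
      | xargs sed -i 's/llama_index\.bridge/llama_index.legacy.bridge/g'
    # other moved submodules may need the same treatment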

The current application is only targeted at Windows.

anujj avatar May 23 '24 09:05 anujj

https://chat-withrtx.com/linux/

shanness avatar Jul 01 '24 10:07 shanness

> chat-withrtx.com/linux

Is this site legit? It seems fake.

payrim avatar Jul 09 '24 11:07 payrim

Anyway, I now decided to purge everything I can that is related to Nvidia.

I now use llama-cpp (and the Python bindings), compiled with the Vulkan backend instead of CUDA.

I can chat with documents using llama-index or my own ChromaDB + prompt setup. I can use whatever model I want, and it can also split the model across several GPUs + the CPU if needed. It's easy to use, easy to develop with, and I have control.

Everything is open source and targeted at Linux.
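
Editorial note, not from the comment: a minimal sketch of that kind of setup with llama-cpp-python, assuming a local GGUF model at a hypothetical path; the exact Vulkan flag depends on the bundled llama.cpp version:

    # build the Python bindings against the Vulkan backend instead of CUDA
    # (older releases expect -DLLAMA_VULKAN=on instead of -DGGML_VULKAN=on)
    CMAKE_ARGS="-DGGML_VULKAN=on" pip install --force-reinstall --no-cache-dir llama-cpp-python
    # quick smoke test; ./models/model.gguf is a placeholder path
    python -c "from llama_cpp import Llama; llm = Llama(model_path='./models/model.gguf', n_gpu_layers=-1); print(llm('Q: What is RAG? A:', max_tokens=64)['choices'][0]['text'])"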

Very disappointed in how Nvidia fails to thank the Linux community. The driver is still a problem, CUDA is closed source, and I'm tired of this direction.

Sorry guys, maybe one day you will follow our philosophy.

metal3d avatar Jul 11 '24 20:07 metal3d