LLMinator
LLMinator: Run & Test LLMs directly from HuggingFace
Gradio-based tool with an integrated chatbot for running and testing LLMs locally, directly from HuggingFace.
An easy-to-use tool made with Gradio, LangChain, and Torch.
⚡ Features
- Context-aware Streaming Chatbot.
- Inbuilt code syntax highlighting.
- Load any LLM repo directly from HuggingFace.
- Supports both CPU & CUDA modes.
- Enable LLM inference with llama.cpp using llama-cpp-python.
- Convert models (Safetensors, pt to gguf, etc.).
- Customize LLM inference parameters (n_gpu_layers, temperature, max_tokens, etc.).
- Real-time text generation via websockets, enabling seamless integration with different frontend frameworks.
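As a rough illustration of what the inference parameters listed above control, here is a minimal sketch that calls llama-cpp-python directly. It is not LLMinator's own code: the helper names, the default values, and the model path in the usage comment are assumptions made up for this example.

```python
def build_sampling_kwargs(temperature=0.7, max_tokens=256):
    """Validate and collect per-request sampling options (example defaults)."""
    if temperature < 0:
        raise ValueError("temperature must be non-negative")
    if max_tokens < 1:
        raise ValueError("max_tokens must be at least 1")
    # stream=True asks llama-cpp-python to return chunks as they are generated.
    return {"temperature": temperature, "max_tokens": max_tokens, "stream": True}


def stream_completion(model_path, prompt, n_gpu_layers=0, **sampling):
    """Yield generated text pieces from a local GGUF model."""
    # Imported lazily so the helper above works without llama-cpp-python installed.
    from llama_cpp import Llama

    # n_gpu_layers=0 keeps the model on the CPU; a positive value offloads
    # that many layers to the GPU when the library was built with GPU support.
    llm = Llama(model_path=model_path, n_gpu_layers=n_gpu_layers)
    for chunk in llm(prompt, **build_sampling_kwargs(**sampling)):
        yield chunk["choices"][0]["text"]


# Usage (hypothetical model path, replace with a real GGUF file):
#   for piece in stream_completion("models/example.gguf", "Hello!", temperature=0.2):
#       print(piece, end="", flush=True)
```

Lower `temperature` values make the output more deterministic, while `max_tokens` caps the length of each response.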
🚀 Installation
To use LLMinator, follow these simple steps:
Clone the LLMinator repository from GitHub & install requirements
```
git clone https://github.com/Aesthisia/LLMinator.git
cd LLMinator
pip install -r requirements.txt
```
Build LLMinator with llama.cpp:
- Using `make`:
  - On Linux or MacOS: `make`
  - On Windows:
    - Download the latest fortran version of w64devkit.
    - Extract `w64devkit` on your PC.
    - Run `w64devkit.exe`.
    - Use the `cd` command to reach the `LLMinator` folder.
    - From here you can run: `make`
- Using `CMake`: run `mkdir build`, then `cd build`, then `cmake ..`
Launch LLMinator in the browser
- Run the LLMinator tool using the command `python webui.py`.
- Access the web interface by opening http://127.0.0.1:7860 in your browser.
- Start interacting with the chatbot and experimenting with LLMs!
Check out this YouTube video to follow the installation steps.
Command line arguments
| Argument | Default | Description |
|---|---|---|
| `--host` | 127.0.0.1 | Host or IP address on which the server will listen for incoming connections |
| `--port` | 7860 | Launch gradio with given server port |
| `--share` | False | This generates a public shareable link that you can send to anybody |
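To make the defaults in the table concrete, the following sketch parses the same three flags with `argparse` and maps them onto the keyword arguments that Gradio's `launch()` accepts (`server_name`, `server_port`, `share`). It is only an illustration, not the project's actual `webui.py`; the function names are hypothetical, and treating `--share` as a boolean switch is an assumption.

```python
import argparse


def parse_cli(argv=None):
    """Parse flags mirroring the table above (same names and defaults)."""
    parser = argparse.ArgumentParser(description="Hypothetical launcher sketch")
    parser.add_argument("--host", default="127.0.0.1", help="address to listen on")
    parser.add_argument("--port", type=int, default=7860, help="server port")
    # Assumption: --share is a switch that is off unless passed.
    parser.add_argument("--share", action="store_true", help="create a public link")
    return parser.parse_args(argv)


def to_launch_kwargs(args):
    """Translate parsed flags into keyword arguments for Gradio's launch()."""
    return {"server_name": args.host, "server_port": args.port, "share": args.share}


# Example: listen on all interfaces, on a non-default port.
args = parse_cli(["--host", "0.0.0.0", "--port", "8080"])
print(to_launch_kwargs(args))
```

The equivalent invocation of the real tool would be `python webui.py --host 0.0.0.0 --port 8080`.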
Connect to WebSocket for generation
Connect to ws://localhost:7861/ for real-time text generation. Submit prompts and receive responses through the websocket connection.
Integration with Frontends:
The provided `example/index.html` demonstrates basic usage of text generation through a websocket connection. You can integrate it with any frontend framework, such as React.js.
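For clients outside the browser, a connection to the same endpoint might look like the sketch below, which uses the third-party `websockets` package. The message format is an assumption: the sketch sends the prompt as plain text and concatenates whatever frames come back, which may not match what the server actually expects. Check `example/index.html` for the real protocol.

```python
import asyncio


def ws_url(host="localhost", port=7861):
    """Build the WebSocket URL given in the section above."""
    return f"ws://{host}:{port}/"


async def generate(prompt, url=None):
    """Send one prompt and collect every frame the server sends back."""
    # Lazy import: requires the third-party `websockets` package.
    import websockets

    pieces = []
    async with websockets.connect(url or ws_url()) as ws:
        # Assumption: the server accepts the raw prompt text as a single message.
        await ws.send(prompt)
        # Iterates until the server closes the connection.
        async for message in ws:
            pieces.append(message if isinstance(message, str) else message.decode())
    return "".join(pieces)


# Usage (needs LLMinator running locally):
#   print(asyncio.run(generate("Write a haiku about GPUs")))
```

If the server keeps the connection open after a response, the loop will wait indefinitely, so a real client would need its own end-of-response condition or a timeout.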
Installation and Development Tips
Python Version
- Compatible Versions: This project is compatible with Python 3.8 through 3.11. Ensure you have one of these versions installed on your system. You can check your Python version by running `python --version` or `python3 --version` in your terminal.
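The same check can be done from inside Python. This is a small sketch that assumes the supported range is 3.8 through 3.11 inclusive; the constant and function names are illustrative.

```python
import sys

MIN_VERSION = (3, 8)
MAX_VERSION = (3, 11)


def is_supported(version=None):
    """Return True if (major, minor) falls within 3.8 to 3.11 inclusive."""
    # Only major and minor matter; patch releases are ignored.
    major_minor = tuple((version or sys.version_info)[:2])
    return MIN_VERSION <= major_minor <= MAX_VERSION


print("Supported interpreter:", is_supported())
```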
CMake and C Compiler
- CMake Dependency: If you plan to build the project using CMake, make sure you have CMake installed.
- C Compiler: Additionally, you'll need a C compiler such as GCC. One is typically included with most Linux distributions. You can check this by running `gcc --version` in your terminal. Installation instructions for your specific operating system can be found online.
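A quick way to see which build tools are already on your `PATH` is sketched below. The list of tool names is an assumption based on the build steps in this README; adjust it to your setup.

```python
import shutil


def find_build_tools(tools=("make", "cmake", "gcc")):
    """Map each tool name to its location on PATH, or None if it is missing."""
    return {tool: shutil.which(tool) for tool in tools}


for tool, path in find_build_tools().items():
    print(f"{tool}: {path or 'not found'}")
```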
Visual Studio Code
- Visual Studio Installer: If you're using Visual Studio Code for development, you'll need the C++ development workload installed. You can install it through the Visual Studio Installer.
GPU Acceleration (CUDA):
- CUDA Installation: To leverage GPU acceleration, you'll need CUDA installed on your system. Download instructions are available on the NVIDIA website.
- Torch Compatibility: After installing CUDA, confirm CUDA availability with `torch.cuda.is_available()`. When using a GPU, ensure you follow the project's specific `llama-cpp-python` installation configuration for CUDA support.
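The availability check mentioned above can be wrapped in a short script. This is a sketch only; the function name and the status messages are made up for the example.

```python
def cuda_status():
    """Report whether PyTorch can see a CUDA device."""
    try:
        import torch
    except ImportError:
        return "torch not installed"
    if torch.cuda.is_available():
        # Device 0 is the first GPU visible to PyTorch.
        return f"CUDA available: {torch.cuda.get_device_name(0)}"
    return "CUDA not available, falling back to CPU"


print(cuda_status())
```

Note that this only confirms PyTorch's view of the GPU; `llama-cpp-python` needs its own CUDA-enabled build to offload layers.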
Reporting Issues:
If you encounter any errors or issues, feel free to file a detailed report in the project's repository. We're always happy to help! When reporting an issue, please provide as much information as possible, including the error message, logs, the steps you took, and your system configuration. This makes it easier for us to diagnose and fix the problem quickly.
🤝 Contributions
We welcome contributions from the community to enhance LLMinator further. If you'd like to contribute, please follow these guidelines:
- Fork the LLMinator repository on GitHub.
- Create a new branch for your feature or bug fix.
- Test your changes thoroughly.
- Submit a pull request, providing a clear description of the changes you've made.
Reach out to us: [email protected]