NanoLLM
Optimized local inference for LLMs with HuggingFace-like APIs for quantization, vision/language models, multimodal agents, speech, vector DB, and RAG.
In Live LLaVA, NanoVLM, and NanoDB, we use video files as the video input, since we don't have a V4L2 USB webcam at the moment. Based on the demo descriptions, it seems it supports...
I am trying to use the model llava-v1.6-mistral-7b-hf in the text-generation-webui demo, but I am getting errors. The last few lines of the error message read: `/usr/local/lib/python3.10/dist-packages/transformers/models/auto/configuration_auto.py:1119 in from_pretrained`...
Hi @dusty-nv, thanks for this amazing library! We're using it in a cool art project for Burning Man :-) I tested the new llava 1.6 (specifically https://huggingface.co/lmms-lab/llama3-llava-next-8b), and it seems...
When I tried to call:

```python
llm = NanoLLM.from_pretrained(
    model="TinyLlama/TinyLlama-1.1B-Chat-v1.0",
    api='hf',
    api_token='mytoken',
    quantization='q4f16_ft',
)
```

I got:

> Traceback (most recent call last):
> File "/root/nanollm.py", line 6, in...
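For context, `q4f16_ft` is one of MLC's quantization presets, so pairing it with `api='hf'` may be the source of the failure. As a hedged illustration (the table below is an assumption for this sketch, not NanoLLM's authoritative support matrix), a pre-flight check on the api/quantization pairing could look like:

```python
# Hypothetical api -> quantization table (an assumption for illustration;
# consult the NanoLLM docs for the real supported combinations).
SUPPORTED_QUANT = {
    "mlc": {"q4f16_ft", "q4f16_1", "q8f16_ft"},  # MLC quantization presets
    "hf":  {None},                               # plain HF loading, no quant arg
}

def quant_is_valid(api: str, quantization) -> bool:
    """Return True if the (api, quantization) pairing appears in the table."""
    return quantization in SUPPORTED_QUANT.get(api, set())

print(quant_is_valid("hf", "q4f16_ft"))   # MLC preset with the HF backend
print(quant_is_valid("mlc", "q4f16_ft"))
```

Running such a check before `NanoLLM.from_pretrained()` would surface the mismatch as a clear message instead of a deep traceback.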
Hello all, I hope whoever reads this is doing well!! :) So I'm trying to get this going on my Jetson Nano 8GB. I'm getting stuck (maybe?) at Quantization. I...
Hi @dusty-nv , amazing work. I am the creator of Nous Obsidian 3B and found out that NanoLLM supports it, so thank you very much! I just dropped nanoLLaVA which...
I cloned the repo and created the `venv` virtual environment using Python 3.10.12. I got the following error:

```
(venv) bryan@mimzy-jetson:~/git/NanoLLM$ pip3 install -r requirements.txt
Requirement already satisfied: torch in ./venv/lib/python3.10/site-packages (from...
```
Dear @dusty-nv, I'm trying the example code on the web page [Function Calling](https://dusty-nv.github.io/NanoLLM/chat.html#function-calling). I tried both Llama-2-7b-chat-hf and Meta-Llama-3-8B-Instruct; Llama-2-7b-chat-hf seems more reliable than Meta-Llama-3-8B-Instruct. Here are the replies...
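For readers unfamiliar with the pattern: text-based function calling generally works by scanning the model's reply for call expressions and substituting each one with the function's return value. A minimal generic sketch (not NanoLLM's actual implementation; the `TIME`/`DATE` names and stub values are purely illustrative):

```python
import re

# Registry of callable names the model is allowed to invoke.
# Stub return values stand in for real clock/calendar lookups.
FUNCTIONS = {
    "TIME": lambda: "12:30 PM",
    "DATE": lambda: "5/1/2024",
}

def run_function_calls(reply: str) -> str:
    """Replace every NAME() occurrence in the reply with its result."""
    def substitute(match):
        fn = FUNCTIONS.get(match.group(1))
        return fn() if fn else match.group(0)  # leave unknown calls intact
    return re.sub(r"\b([A-Z_]+)\(\)", substitute, reply)

print(run_function_calls("The current time is TIME() on DATE()."))
```

Reliability differences between models then come down to how consistently each one emits the exact call syntax the parser expects.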
Hi, I am using the NanoLLM container on JetPack 6 rev 1. By itself it functions perfectly fine, but when I tried to use my example/unit-test Python file inside...
Hello, I've been running some tests using the `nano_llm.vision.video` module with live camera streaming on an AGX Orin 64GB, with the following parameters:

```
--model Efficient-Large-Model/VILA1.5-13b \
--max-images 5 \
--max-new-tokens...
```
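For reference, a full invocation of that module might look like the sketch below. The original command is truncated, so the `--max-new-tokens` value and the `--video-input` source here are assumptions for illustration, not a reconstruction of the poster's command:

```shell
# Hedged sketch of a live-camera run of nano_llm.vision.video
# (flag values marked "assumed" are illustrative only).
python3 -m nano_llm.vision.video \
    --model Efficient-Large-Model/VILA1.5-13b \
    --max-images 5 \
    --max-new-tokens 48 \
    --video-input /dev/video0   # assumed V4L2 camera source
```

On a live stream, `--max-images` bounds how many recent frames are kept in the multimodal context, which directly affects memory use on the Orin.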