NanoLLM

Optimized local inference for LLMs with HuggingFace-like APIs for quantization, vision/language models, multimodal agents, speech, vector DB, and RAG.
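Since the description mentions a vector DB and RAG, here is a minimal, self-contained sketch of what retrieval-augmented generation reduces to: embed a query, rank stored documents by cosine similarity, and prepend the best matches to the prompt. This is an illustration only, not NanoLLM's actual API; the toy 3-dimensional "embeddings" and the `retrieve` helper are hypothetical stand-ins for a real encoder and vector index.

```python
import math

def cosine(a, b):
    # cosine similarity between two equal-length vectors
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def retrieve(query_vec, index, top_k=2):
    # index: list of (text, embedding) pairs; return the top_k texts by similarity
    scored = sorted(index, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [text for text, _ in scored[:top_k]]

# toy embeddings standing in for a real encoder's output
index = [
    ("Jetson AGX Orin has up to 64GB of RAM.", [0.9, 0.1, 0.0]),
    ("LLaVA is a vision/language model.",       [0.1, 0.9, 0.2]),
    ("RAG prepends retrieved context.",         [0.0, 0.2, 0.9]),
]
context = retrieve([0.85, 0.15, 0.05], index, top_k=1)
prompt = "\n".join(context) + "\nQuestion: How much RAM does Orin have?"
print(context)
```

A real deployment would swap the toy vectors for encoder embeddings and the linear scan for an approximate-nearest-neighbor index.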

32 NanoLLM issues

In Live LLaVA, NanoVLM, and NanoDB, we use video files as the video input, since we don't have a V4L2 USB webcam at the moment. Based on the demo descriptions, it seems it supports...

I am trying to use the llava-v1.6-mistral-7b-hf model in the text-generation-webui demo, but I am getting errors. The last few lines of the error message read: /usr/local/lib/python3.10/dist-packages/transformers/models/auto/configuration_auto.py:1119 in from_pretrained...

Hi @dusty-nv, thanks for this amazing library! We're using it in a cool art project for Burning Man :-) I tested the new LLaVA 1.6 (specifically https://huggingface.co/lmms-lab/llama3-llava-next-8b), and it seems...

When I tried to call:

```python
llm = NanoLLM.from_pretrained(
    model="TinyLlama/TinyLlama-1.1B-Chat-v1.0",
    api='hf',
    api_token='mytoken',
    quantization='q4f16_ft',
)
```

I got:

> Traceback (most recent call last):
>   File "/root/nanollm.py", line 6, in...

Hello all, I hope whoever reads this is doing well!! :) So I'm trying to get this going on my Jetson Nano 8GB. I'm getting stuck (maybe?) at Quantization. I...

Hi @dusty-nv , amazing work. I am the creator of Nous Obsidian 3B and found out that NanoLLM supports it, so thank you very much! I just dropped nanoLLaVA which...

I cloned the repo and created the `venv` virtual environment using `python3.10.12`. Got the following error:

```
(venv) bryan@mimzy-jetson:~/git/NanoLLM$ pip3 install -r requirements.txt
Requirement already satisfied: torch in ./venv/lib/python3.10/site-packages (from...
```

Dear @dusty-nv, I'm trying the example code on the [Function Calling](https://dusty-nv.github.io/NanoLLM/chat.html#function-calling) page. I tried both Llama-2-7b-chat-hf and Meta-Llama-3-8B-Instruct; Llama-2-7b-chat-hf seems more reliable than Meta-Llama-3-8B-Instruct. Here are the replies...
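The reliability difference reported above usually comes down to whether the model emits calls the runtime can parse. As a self-contained sketch of the dispatch side only (the `bot_function` registry, the `name(args)` text format, and the `run_calls` helper are hypothetical illustrations, not NanoLLM's actual implementation):

```python
import re
import ast

FUNCTIONS = {}

def bot_function(fn):
    # hypothetical decorator registering a callable the model may invoke
    FUNCTIONS[fn.__name__] = fn
    return fn

@bot_function
def add(a, b):
    """Add two numbers."""
    return a + b

# match name(args) patterns, optionally wrapped in backticks
CALL_RE = re.compile(r"`?(\w+)\(([^)]*)\)`?")

def run_calls(model_output):
    # scan model output for registered function calls and invoke them
    results = []
    for name, args in CALL_RE.findall(model_output):
        if name in FUNCTIONS:
            # literal_eval keeps argument parsing safe (numbers, strings, etc.)
            parsed = [ast.literal_eval(a.strip()) for a in args.split(",") if a.strip()]
            results.append(FUNCTIONS[name](*parsed))
    return results

print(run_calls("Sure, the answer is add(2, 3)."))  # -> [5]
```

A model that wraps calls in extra prose or malformed syntax (as weaker instruction-followers often do) simply never matches the pattern, which is one plausible source of the flakiness described in the issue.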

Hi, I am using the NanoLLM container on JetPack 6 rev 1. By itself it functions perfectly fine, but when I tried to use my example/unittest Python file inside...

Hello, I've been running some tests using the nano_llm.vision.video module with live camera streaming on an AGX Orin 64GB, with the following parameters: --model Efficient-Large-Model/VILA1.5-13b \ --max-images 5 \ --max-new-tokens...
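The `--max-images 5` flag above caps how many recent frames stay in the model's context. A minimal sketch of that sliding-window behavior in pure Python (this `FrameWindow` class is an illustration, not the nano_llm.vision.video implementation):

```python
from collections import deque

class FrameWindow:
    # keep only the most recent max_images frames for the VLM context
    def __init__(self, max_images=5):
        self.frames = deque(maxlen=max_images)

    def push(self, frame):
        self.frames.append(frame)  # oldest frame is dropped automatically when full

    def context(self):
        return list(self.frames)

win = FrameWindow(max_images=5)
for i in range(8):          # simulate 8 incoming camera frames
    win.push(f"frame-{i}")
print(win.context())        # -> the last 5 frames: frame-3 .. frame-7
```

Bounding the window like this keeps per-token attention cost and GPU memory flat regardless of how long the stream runs.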