NanoLLM

Optimized local inference for LLMs with HuggingFace-like APIs for quantization, vision/language models, multimodal agents, speech, vector DB, and RAG.

32 NanoLLM issues

The following commit seems to move several speech-related plugins from the audio directory to a new speech directory: hash 31ad117619ff12f21f841484fcdf963b1512b159, message "updated plugins", date 2024/06/12 @ 9:50 PM. Trying...

Since the auto prompt with video input generates excessive output, the TTS can't keep up and the delay keeps growing. It would be ideal if we could skip past outputs and use...
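The behavior being asked for is roughly to drop stale text when the TTS falls behind, instead of letting its backlog grow. A minimal, framework-agnostic sketch of that idea (the class and parameter names here are hypothetical, not part of NanoLLM's plugin API):

```python
import queue

class DropStaleTTSQueue:
    """Bounded text queue that discards the oldest pending utterances
    when the TTS consumer falls behind the LLM producer."""

    def __init__(self, max_pending=2):
        self.pending = queue.Queue(maxsize=max_pending)

    def put(self, text):
        # If the queue is full, throw away the oldest entry so the TTS
        # always speaks something close to the latest LLM output.
        while True:
            try:
                self.pending.put_nowait(text)
                return
            except queue.Full:
                try:
                    self.pending.get_nowait()  # drop the stalest utterance
                except queue.Empty:
                    pass

    def get(self, timeout=None):
        # Called from the TTS side to fetch the next utterance to speak.
        return self.pending.get(timeout=timeout)
```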

In e.g. web_chat.py, you have the following callback:

```python
def on_llm_reply(self, text):
    """
    Update the web chat history when the latest LLM response arrives.
    """
    self.send_chat_history()
```

From what I...
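For context, one way to hook custom behavior into that callback is to subclass the agent and override it. A minimal sketch, assuming the class is named WebChat and importable from nano_llm.agents.web_chat (the import path is an assumption based on the file name in the issue):

```python
from nano_llm.agents.web_chat import WebChat  # assumed import path and class name

class LoggingWebChat(WebChat):
    """WebChat variant that inspects each completed LLM reply before
    the chat history is pushed to the browser."""

    def on_llm_reply(self, text):
        print(f"[LLM reply] {text}")   # custom handling of the reply text
        super().on_llm_reply(text)     # keep the original history update
```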

Hi, I tried to run this demo on my Jetson: https://www.jetson-ai-lab.com/tutorial_live-llava.html After analyzing a few frames, I keep getting the same results over and over again even though the camera...

Hello, I have a question about the AgentStudio feature: does it have a function to use different tools depending on the input prompt, similar to the Agent feature in LangChain?
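For reference, the LangChain-style behavior the question refers to is letting the agent pick a different tool depending on the prompt. A hand-rolled sketch of that dispatch pattern, independent of whatever AgentStudio actually exposes (all names below are hypothetical):

```python
# Hypothetical prompt-based tool routing, not AgentStudio's API.
def search_web(query: str) -> str:
    return f"search results for '{query}'"

def get_weather(location: str) -> str:
    return f"weather report for '{location}'"

TOOLS = {
    'search': search_web,
    'weather': get_weather,
}

def route(prompt: str) -> str:
    # A real agent would let the LLM choose the tool; this keyword check
    # just illustrates the "different tool per prompt" behavior being asked about.
    if 'weather' in prompt.lower():
        return TOOLS['weather'](prompt)
    return TOOLS['search'](prompt)

print(route("What's the weather in Tokyo?"))
```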

Problem: an error occurs while running `jetson-containers run $(autotag nano_llm) python3 -m nano_llm.vision.vla --api mlc --model dusty-nv/openvla-7b-mimicgen --quantization q4f16_ft --dataset dusty-nv/bridge_orig_ep100 --dataset-type rlds --max-episodes 10 --save-stats /data/benchmarks/openvla_mimicgen_int4.json` Message: 18:21:36 | INFO...

Hello @dusty-nv! I think the NanoLLM project could benefit from offering a server-client architecture, a little like what ollama does with `ollama serve` and the ollama client API. This...
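As a rough illustration of that proposal (not an existing NanoLLM interface), the server side could be a small HTTP endpoint wrapping a loaded model that clients POST prompts to. The NanoLLM.from_pretrained / model.generate calls below are assumptions based on the project's documented HuggingFace-like API, and the port is arbitrary:

```python
# Hypothetical server-side sketch of the proposed split; not part of NanoLLM today.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

from nano_llm import NanoLLM  # assumed API, per NanoLLM's docs

model = NanoLLM.from_pretrained('meta-llama/Llama-2-7b-chat-hf', api='mlc')

class ChatHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Client sends {"prompt": "..."}; server replies with the generated text.
        body = json.loads(self.rfile.read(int(self.headers['Content-Length'])))
        reply = model.generate(body['prompt'], streaming=False)
        self.send_response(200)
        self.send_header('Content-Type', 'application/json')
        self.end_headers()
        self.wfile.write(json.dumps({'reply': str(reply)}).encode())

if __name__ == '__main__':
    # A client would then POST prompts to http://<jetson>:11435/
    HTTPServer(('0.0.0.0', 11435), ChatHandler).serve_forever()
```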