Vision Core AI
Demo Python script app to interact with a llama.cpp server using the Whisper API, a microphone, and a webcam.
Step 1: Install llama.cpp and package dependencies on your machine
Clone the llama.cpp repository from GitHub:
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
On macOS:
Build with make:
make
Or, if you prefer cmake (configure first, then build):
cmake -B build
cmake --build build --config Release
macOS requirements
You need to install these dependencies on your machine: ffmpeg and portaudio
brew install ffmpeg portaudio
Also be sure to grant the terminal the necessary permissions under Security & Privacy > Privacy.
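To verify that portaudio is installed and your microphone is visible, you can run a quick check like the sketch below. It only enumerates input devices; the device names printed will vary per machine.

```python
import pyaudio  # wraps the portaudio library installed above

# Quick check that portaudio can see your input (microphone) devices.
pa = pyaudio.PyAudio()
for i in range(pa.get_device_count()):
    info = pa.get_device_info_by_index(i)
    if info["maxInputChannels"] > 0:
        print(i, info["name"])
pa.terminate()
```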
Step 2: Download the Model!
- Download these two files from Hugging Face - mys/ggml_bakllava-1:
  - ggml-model-q4_k.gguf (or any other quantized model) - only one is required!
  - mmproj-model-f16.gguf
- Copy the paths of those two files.
- Run this in the llama.cpp repository (replace YOUR_PATH with the paths to the files you downloaded):
macOS
./server -m YOUR_PATH/ggml-model-q4_k.gguf --mmproj YOUR_PATH/mmproj-model-f16.gguf -ngl 1
Windows
server.exe -m REPLACE_WITH_YOUR_PATH\ggml-model-q4_k.gguf --mmproj REPLACE_WITH_YOUR_PATH\mmproj-model-f16.gguf -ngl 1
The llama.cpp server is now up and running!
⚠️ NOTE: Keep the server running in the background.
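With the server up, you can optionally smoke-test it from Python before launching the demo. This is a minimal sketch, assuming the server's default address http://localhost:8080 and the /completion endpoint's image_data field for multimodal (LLaVA-style) models; photo.jpg is any local test image.

```python
import base64
import requests  # pip install requests

# Minimal smoke test for the running llama.cpp server (assumptions:
# default address http://localhost:8080, /completion endpoint with
# base64 image_data referenced as [img-1] in the prompt).
with open("photo.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

response = requests.post(
    "http://localhost:8080/completion",
    json={
        "prompt": "USER:[img-1] Describe this image.\nASSISTANT:",
        "image_data": [{"data": image_b64, "id": 1}],
        "n_predict": 128,
    },
)
print(response.json()["content"])
```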
Next, let's run the script that uses the webcam and microphone.
Step 3: Running the Demo
Open a new terminal window and clone the demo app:
git clone https://github.com/herrera-luis/vision-core-ai.git
cd vision-core-ai
Install the Python dependencies:
pip install -r requirements.txt
Run the main script:
python main.py
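Under the hood, the demo records microphone audio and transcribes it with Whisper before talking to the llama.cpp server. The sketch below is a hypothetical illustration of that capture-and-transcribe step, not the actual main.py; it assumes the open-source openai-whisper package, the pyaudio bindings for the portaudio library installed earlier, and illustrative choices for the prompt.wav filename, clip length, and "base" model size. Whisper also relies on the ffmpeg dependency from Step 1 to decode audio.

```python
import wave

import pyaudio  # backed by the portaudio package installed above
import whisper  # pip install openai-whisper

# Hypothetical capture-and-transcribe step; the real main.py may differ.
CHUNK, RATE, SECONDS = 1024, 16000, 5

# Record a short clip from the default microphone.
pa = pyaudio.PyAudio()
stream = pa.open(format=pyaudio.paInt16, channels=1, rate=RATE,
                 input=True, frames_per_buffer=CHUNK)
frames = [stream.read(CHUNK) for _ in range(int(RATE / CHUNK * SECONDS))]
stream.stop_stream()
stream.close()
pa.terminate()

# Write the raw frames to a WAV file Whisper can read.
with wave.open("prompt.wav", "wb") as wf:
    wf.setnchannels(1)
    wf.setsampwidth(pyaudio.get_sample_size(pyaudio.paInt16))
    wf.setframerate(RATE)
    wf.writeframes(b"".join(frames))

# Transcribe the clip locally with Whisper.
model = whisper.load_model("base")
print(model.transcribe("prompt.wav")["text"])
```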
How to interact with the app
While the application is running, press the i or c key once to start recording, then press the same key again to stop it:
- i will use your webcam
- c will use chat
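For a sense of how such a toggle can be wired up, here is a minimal, hypothetical sketch of a key-driven webcam capture loop using OpenCV (opencv-python). It is not the actual main.py; the window title, snapshot.jpg filename, and q quit key are illustrative assumptions.

```python
import cv2  # pip install opencv-python

# Hypothetical sketch of the i-key toggle; the real main.py may differ.
cap = cv2.VideoCapture(0)  # default webcam
recording = False

while True:
    ok, frame = cap.read()
    if not ok:
        break
    cv2.imshow("vision-core-ai", frame)
    key = cv2.waitKey(1) & 0xFF
    if key == ord("i"):
        recording = not recording  # first press starts, second press stops
        if not recording:
            cv2.imwrite("snapshot.jpg", frame)  # frame handed to the server
    elif key == ord("q"):  # illustrative quit key
        break

cap.release()
cv2.destroyAllWindows()
```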