automaticcat
We will have 2 main modes: LOCAL and REMOTE (a rough type sketch follows the diagram link below).
Link to the Excalidraw diagram: https://excalidraw.com/#json=kOBPg9OoLTCLAm3JO7FHn,qV29wMh7fLvGkFXf5HRYNA
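For reference, a minimal TypeScript sketch of what the two modes could look like as a config type; the names and fields here are illustrative assumptions, not Jan's or Nitro's actual API:

```ts
// Illustrative only: names and shape are assumptions, not Jan's actual types.
type InferenceMode = "LOCAL" | "REMOTE";

interface InferenceConfig {
  mode: InferenceMode;
  // LOCAL runs against the on-device engine; REMOTE targets a hosted endpoint.
  endpoint?: string; // only used when mode === "REMOTE"
}

const local: InferenceConfig = { mode: "LOCAL" };
const remote: InferenceConfig = {
  mode: "REMOTE",
  endpoint: "https://api.example.com/v1", // placeholder URL
};
```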
Stale; right now we're doing this in Cortex under a different ticket.
This is not a bug, since I don't have any hard-coded token limit; the limit comes in with each request (see the sketch below).
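To illustrate, a hedged sketch of a per-request token limit in an OpenAI-style chat request; the endpoint path and port are assumptions, not Nitro's confirmed API:

```ts
// Hypothetical OpenAI-style request; the URL and port are assumptions.
const res = await fetch("http://localhost:3928/v1/chat/completions", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    messages: [{ role: "user", content: "Hello" }],
    // The limit is supplied per request; nothing is hard-coded server-side.
    max_tokens: 2048,
  }),
});
console.log(await res.json());
```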
Should be transferred to Jan.
Blocked by the Python runtime.
This is purely a performance issue. We can try to mitigate it with a warning @imtuyethan
This should be handled at the Nitro inference plugin level.
Need to resolve the differences between the GGML model/file formats used by whisper.cpp and llama.cpp (a format-detection sketch is below).
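For context, the two projects write different 4-byte magics at the start of the model file, so one way to tell them apart is to dispatch on the magic. A minimal Node/TypeScript sketch, assuming the publicly documented ggml/ggjt/gguf magic values:

```ts
import { openSync, readSync, closeSync } from "node:fs";

// Known magics (from public ggml/gguf headers), read as little-endian uint32:
//   0x67676d6c "ggml" -> legacy ggml (whisper.cpp era)
//   0x67676a74 "ggjt" -> older llama.cpp format
//   0x46554747 "GGUF" -> current llama.cpp format
function detectModelFormat(path: string): string {
  const fd = openSync(path, "r");
  const buf = Buffer.alloc(4);
  readSync(fd, buf, 0, 4, 0); // first 4 bytes of the file
  closeSync(fd);
  switch (buf.readUInt32LE(0)) {
    case 0x67676d6c: return "ggml (legacy, whisper.cpp)";
    case 0x67676a74: return "ggjt (older llama.cpp)";
    case 0x46554747: return "gguf (current llama.cpp)";
    default: return "unknown";
  }
}

console.log(detectModelFormat("./ggml-base.en.bin"));
```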
Hi @Elsayed91, I have a working CUDA example Dockerfile in my homelab; I'll update it for everyone to try as well.
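In the meantime, a minimal sketch of what such a Dockerfile could look like; the base image tag, CMake flag, and repo/binary paths are assumptions, not the actual file from my homelab:

```dockerfile
# Sketch only: base image tag, CMake flags, and binary path are assumptions.
FROM nvidia/cuda:12.2.0-devel-ubuntu22.04 AS build
RUN apt-get update && apt-get install -y --no-install-recommends \
      git cmake build-essential ca-certificates \
    && rm -rf /var/lib/apt/lists/*
WORKDIR /src
# Assumed repo URL; clone recursively to pull the llama.cpp submodule.
RUN git clone --recursive https://github.com/janhq/nitro.git .
# LLAMA_CUBLAS was llama.cpp's CUDA switch at the time; an assumption here.
RUN cmake -B build -DLLAMA_CUBLAS=ON && cmake --build build -j

FROM nvidia/cuda:12.2.0-runtime-ubuntu22.04
# Assumed output path for the built server binary.
COPY --from=build /src/build/nitro /usr/local/bin/nitro
ENTRYPOINT ["nitro"]
```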