automaticcat

98 comments by automaticcat

Link to Excalidraw: https://excalidraw.com/#json=kOBPg9OoLTCLAm3JO7FHn,qV29wMh7fLvGkFXf5HRYNA

Stale. We're now handling this in Cortex under a different ticket.

This is not a bug, since we don't hard-code any token limit.

Blocked by the Python runtime.

This is purely a performance issue. We can try to mitigate it with a warning @imtuyethan

This should be handled at the Nitro inference plugin level.

We need to resolve the differences in GGML model and file format between whisper.cpp and llama.cpp.
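
For context, the two projects use different container formats, so a loader has to tell them apart. Below is a minimal sketch (not code from either repo, just an illustration): it peeks at a model file's magic bytes, relying on the fact that GGUF files start with the ASCII bytes `GGUF`, while whisper.cpp's legacy `.bin` models start with the GGML uint32 magic `0x67676d6c`.

```python
import struct
import sys

GGML_MAGIC = 0x67676D6C  # legacy "ggml" magic used by whisper.cpp .bin models


def detect_format(path: str) -> str:
    """Return a rough guess of the model container format based on magic bytes."""
    with open(path, "rb") as f:
        head = f.read(4)
    if head == b"GGUF":
        return "gguf (llama.cpp)"
    if len(head) == 4 and struct.unpack("<I", head)[0] == GGML_MAGIC:
        return "legacy ggml (whisper.cpp-style)"
    return "unknown"


if __name__ == "__main__":
    print(detect_format(sys.argv[1]))
```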

Hi @Elsayed91, I have a working CUDA example Dockerfile in my homelab; I'll update it for everyone to try as well.