cortex
Drop-in, local AI alternative to the OpenAI stack. Multi-engine (llama.cpp, TensorRT-LLM). Powers 👋 Jan
Github -> GitHub
Feature for https://github.com/janhq/nitro/issues/175 - [x] Load multiple models - [ ] Add GET `models` to return models list - [ ] CUDA support for multiple model request at the same...
Is this already possible? If not, could this be a feature? For example: ``` .\nitro 4 127.0.0.1 5000 --ngl 20 ``` on a Windows 11 machine with an NVIDIA GPU.
I still got this bug. _Originally posted by @lovehunter9 in https://github.com/janhq/nitro/issues/273#issuecomment-1878192726_
**Problem** AVX2 is not available on older-generation Core i CPUs, so many users cannot use the Jan app. **Success Criteria** One more distribution for AVX only
**Describe the bug** My Windows machine has 3 GPUs. When I enabled all 3, the token speed was slow (6-9 tokens/s) and it was not even able to load tinyllama 1B....
**Describe the bug** The current way of handling waiting is not optimal and causes many performance issues. FIX: - Need to implement wait using a CV (condition variable) properly to...
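For readers unfamiliar with the pattern this issue asks for, here is a minimal sketch of waiting on a condition variable instead of polling. It is written in Python for brevity and is not Nitro's code; the `TaskQueue` class and its methods are hypothetical names chosen for illustration.

```python
import threading
from collections import deque


class TaskQueue:
    """Hypothetical work queue; illustrative only, not Nitro's implementation."""

    def __init__(self):
        self._items = deque()
        self._cv = threading.Condition()

    def put(self, item):
        # Add the item under the lock, then wake one waiting consumer.
        with self._cv:
            self._items.append(item)
            self._cv.notify()

    def get(self, timeout=None):
        with self._cv:
            # wait_for re-checks the predicate after every wakeup, so
            # spurious wakeups and missed signals are both handled.
            if not self._cv.wait_for(lambda: self._items, timeout=timeout):
                raise TimeoutError("no item arrived in time")
            return self._items.popleft()
```

The consumer sleeps inside `wait_for` until a producer calls `notify`, so no CPU time is spent in a sleep-and-retry loop.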
**Problem** Add docs about using Nitro with chatGPTBox https://github.com/josStorer/chatGPTBox
**Problem** - The current implementation of `chat/completion` supports only base64 for `image_url.url`, which makes it hard to test quickly with curl. Using something like `file://` makes it...