Nick Stogner
Note: after deploying the config updates, you will need to restart the KubeAI and model Pods (`kubectl delete pod`) for the changes to take effect. We are currently working...
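For reference, the restart just means deleting the Pods so their Deployments recreate them with the new config. A minimal sketch (the label selectors below are assumptions and may differ in your install, so check your Pods' labels first):

```shell
# Delete the KubeAI controller Pods and the model-serving Pods so that
# their Deployments recreate them with the updated config.
# NOTE: both label selectors are assumptions; verify against your install.
kubectl delete pod -l app.kubernetes.io/name=kubeai
kubectl delete pod -l app.kubernetes.io/managed-by=kubeai
```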
> which kueue labels concretely, just queue-name or some other too? This entails a separate script I guess which iterates over all jobs and patches them, right?

> How we...
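A script along those lines could be fairly small: iterate over the Jobs and patch the Kueue queue-name label onto each one. A rough sketch (the queue name `my-queue` is a placeholder, and whether queue-name alone is sufficient is exactly the open question above):

```shell
# Rough sketch: patch every Job in the current namespace with a Kueue
# queue-name label. "my-queue" is a placeholder; substitute your LocalQueue.
for job in $(kubectl get jobs -o name); do
  kubectl patch "$job" --type merge \
    -p '{"metadata":{"labels":{"kueue.x-k8s.io/queue-name":"my-queue"}}}'
done
```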
> Could you confirm this?

Yes, I will test this now.
@alculquicondor A quick manual test appears to confirm the behavior: the Eviction condition is added but not acted upon (and resources are not freed up for the preemptor job to...
Thanks for the suggestions @meetzuber!
First of all, sorry about the delayed response, we had a lapse in Issue support. @nestoras KubeAI does not propagate Model labels to the Pods that are created to serve...
Hey Kai, as we discussed over chat, vLLM is typically the go-to for serving concurrent production traffic. Does that work for you, or is Ollama caching still important for you?
That makes sense. We currently have 2 high priority features that we are focusing on: #132 and #266 ... We can probably fit this feature in after those.
Thanks for filing the issue, will take a look soon!
After reviewing this, I think we will wait for this to be fixed in the Ollama project since we are just providing this script as an example. Feel free to...