Sam Stoelinga

Results: 223 comments by Sam Stoelinga

Is the concern that people may override the models from the catalog that are managed by Helm? As a result, a subsequent helm apply might revert the changes made through...

More feedback: "dashboard for autoscaling metrics"

@sjoerdvandenbos-prodrive we've switched to the upstream Helm chart of Open WebUI (https://github.com/substratusai/kubeai/pull/379). We plan to release a new version of the KubeAI Helm chart soon that will use the upstream helm...

> Can you explain more on the following? What is an init?

I was referring to jax distributed initialize which we want to start at roughly the same time on...

Checkpoint saving can be extremely fast with emergency checkpointing. I'm afraid not saving checkpoints may be worse, especially since we have a custom method deployed to delete lingering pods. For...

@Ethanlm I merged the latest main. Do we still want to get this merged?

It seems the newer version of TinyLlama does have a chat template in tokenizer_config.json (https://huggingface.co/TinyLlama/TinyLlama-1.1B-Chat-v1.0/blob/main/tokenizer_config.json#L29), but the v0.3 version is missing the chat template (https://huggingface.co/TinyLlama/TinyLlama-1.1B-Chat-v0.3/blob/main/tokenizer_config.json). The easiest approach for...

Summary: It seems KubeAI doesn't always correctly update the adapters for an endpoint. Restarting KubeAI works around the issue. Next step: figure out why endpoints aren't being updated...

Yes, it was already there before the restart.