Sam Stoelinga
Currently it's hardcoded that the service name must match the deployment name: https://github.com/substratusai/lingo/blob/8c381444dfd2a957964f471ae9cb528cb5b666fd/pkg/endpoints/manager.go#L93

See the relevant code in endpoints/manager.go:
```
const serviceNameLabel = "kubernetes.io/service-name"
serviceName, ok := slice.GetLabels()[serviceNameLabel]
if !ok {
	log.Printf("no...
```
Just an idea on how to implement https://github.com/substratusai/lingo/issues/59
Currently the following is seen in the logs for a Kubernetes service that has no deployment:
```
2024/02/10 19:51:42 Average for deployment: kubernetes: 0 (ceil: 0), current wait count: 0
2024/02/10 19:51:42...
```
```
make test
go test -mod=readonly -race ./pkg/...
? github.com/substratusai/lingo/pkg/autoscaler [no test files]
? github.com/substratusai/lingo/pkg/leader [no test files]
? github.com/substratusai/lingo/pkg/movingaverage [no test files]
? github.com/substratusai/lingo/pkg/stats [no test files]
ok github.com/substratusai/lingo/pkg/deployments...
```
Use case: I've fine-tuned a model in a similar way as here: https://colab.research.google.com/drive/1jCkpikz0J2o20FBQmYmAGdiKmJGOMo-o?usp=sharing#scrollTo=hsD1VKqeA62Z I now have a base model and an adapter. How do I let Basaran load the...
This is helpful when you're using a model to generate a dataset
Right now the following message is seen after waking up from suspend:
```
sub run -f eval-withretrieval-schemasplit-train-80-mistral.yaml .
Error: http2: server sent GOAWAY and closed the connection; LastStreamID=2637, ErrCode=NO_ERROR, debug=""...
```
Steps to reproduce:
1. Create a model with the following name: `wgqlg-withretrieval-schemasplit-train-80-mistral-instruct`

Current result: No job gets created and the following log is observed:
```
2023-10-15T03:59:28Z ERROR Reconciler error {"controller": "model", "controllerGroup":...
```
Create a server with a name like `mistral-7b-v0.1` and the following error will be thrown:
```
2023-10-12T05:28:19Z ERROR Reconciler error {"controller": "server", "controllerGroup": "substratus.ai", "controllerKind": "Server", "Server": {"name":"mistral-7b-v0.1","namespace":"default"}, "namespace":...
```
```
Warning Evicted 60s kubelet Pod ephemeral local storage usage exceeds the total limit of containers 100Gi.
```
This can be frustrating because in many cases the node had more...