Michael Taron
It's pretty common to "install" executables on Linux by symlinking them into `$XDG_BIN_DIR`, e.g. `ln --symbolic "$XDG_DATA_HOME/go/bin/go" "$XDG_BIN_DIR/go"`. I tried something similar with a dotnet CLI I created using System.CommandLine,...
Hello! I was just trying out llgtrt yesterday and was blown away by how easy it was to get LoRA adapters up and running. I was struggling for over a...
The implementations of the [liveness and readiness health checks](https://github.com/guidance-ai/llgtrt/blob/a74cbb55b2e96db8c1c5f7f73c419dfc06ec0411/llgtrt/src/routes/health_check.rs#L18) are swapped relative to what I expect from using [Kubernetes liveness and readiness probes](https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/). I expect liveness to mean "are you alive?"...
I would love to see Prometheus metrics from llgtrt, something along the lines of: https://docs.nvidia.com/nim/large-language-models/latest/observability.html Not only does this allow for monitoring and alerting, but Prometheus metrics can also be used in...
There are a few small changes that I think would improve the LoRA scenario: 1. Instead of adding the `lora_model` parameter to the request, use the existing `model` parameter....