Ettore Di Giacinto
@baditaflorin sounds like you are having installation issues with the NVIDIA container toolkit. Did you follow their docs? https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html#installing-with-ap
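If the toolkit install went through, NVIDIA's guide suggests verifying it by running `nvidia-smi` inside a throwaway container. A minimal Python sketch wrapping that check (the `ubuntu` image tag is just a convenient default; any CUDA-capable base image works):

```python
import subprocess

def check_nvidia_container_toolkit() -> bool:
    """Run nvidia-smi inside a container to verify the NVIDIA container
    toolkit is wired up, per the verification step in NVIDIA's docs."""
    cmd = ["docker", "run", "--rm", "--gpus", "all", "ubuntu", "nvidia-smi"]
    try:
        result = subprocess.run(cmd, capture_output=True, text=True, timeout=120)
    except (FileNotFoundError, subprocess.TimeoutExpired):
        # docker missing from PATH, or the container hung
        return False
    # Zero exit code plus a GPU table on stdout means the runtime sees the GPU.
    return result.returncode == 0 and "NVIDIA-SMI" in result.stdout

if __name__ == "__main__":
    print("toolkit OK" if check_nvidia_container_toolkit() else "toolkit misconfigured")
```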
> Still not able to do p2p inferencing even if workers are online `v2.25.0 (07655c0c2e0e5fe2bca86339a12237b69d258636)`
>
> server and workers envs
>
> CONTEXT_SIZE: "512"
> THREADS: "4"...
> There you go. let me know if you need anything else
>
> [localai-server.log](https://github.com/user-attachments/files/18795950/localai-server.log) [localai-worker-1.log](https://github.com/user-attachments/files/18795951/localai-worker-1.log)

mmh ok that looks weird: what's the environment? it looks like they can auto-discover...
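When comparing logs like the two attached above, a quick filter for discovery-related lines on both sides makes it easier to see whether the worker ever registered. A small sketch; the keyword list is an assumption, so adjust it to whatever your LocalAI version actually logs:

```python
import re
import sys

# Pull out lines mentioning p2p/discovery so the server and worker logs can
# be compared side by side. The keywords below are a guess, not LocalAI's
# exact log vocabulary -- tweak them to match your logs.
KEYWORDS = re.compile(r"p2p|discover|worker|token", re.IGNORECASE)

def interesting_lines(path: str) -> list[str]:
    with open(path, errors="replace") as fh:
        return [line.rstrip() for line in fh if KEYWORDS.search(line)]

if __name__ == "__main__":
    # Usage: python filter_logs.py localai-server.log localai-worker-1.log
    for log in sys.argv[1:]:
        print(f"== {log} ==")
        for line in interesting_lines(log):
            print(line)
```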
To clarify here @j4ys0n - you are referring to unloading models from a group of federated workers, right? Or are you referring to llama.cpp workers? JFYI we have `/backend/shutdown` for...
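For context, a minimal sketch of calling that endpoint from Python. The `/backend/shutdown` route is mentioned above; the JSON body shape (`{"model": ...}`) and the example model name are assumptions, so verify the request format against your version's LocalAI API reference:

```python
import json
import urllib.request

def shutdown_backend(base_url: str, model: str) -> int:
    """Ask a LocalAI instance to shut down the backend holding `model`.
    Body shape is assumed -- check the LocalAI API docs for your version."""
    req = urllib.request.Request(
        f"{base_url}/backend/shutdown",
        data=json.dumps({"model": model}).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status

if __name__ == "__main__":
    # Example: free the backend for a hypothetical model on a local instance.
    print(shutdown_backend("http://localhost:8080", "llama-3.2-1b"))
```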
this should be fixed by https://github.com/mudler/LocalAI/pull/3789
oh nice! that's cool! maybe we can close this already, or do you want to keep it open until we have an e2e working example?
This is looking nice, thank you! just a few small nits
This is going to be a lot more interesting with https://github.com/kairos-io/kairos/issues/3244
https://github.com/kairos-io/kairos/issues/3606
This issue will be covered by #3388