R0CKSTAR
This could be closed. @lowang-bh
We have some thoughts on network-topology-aware scheduling as well, but limited to the same IDC.

## User Story

Say we have a 10,000-GPU cluster and each node...
> Thanks for your kind advice! I'll ask this question in the tritonserver repo.
>
> By the way, in a Kubernetes cluster, a Pod (of containers) can only be scheduled to...
I wrapped `Ollama`, `litellm`, and the `mods` config together in one script. See: https://github.com/yeahdongcn/MacAI/blob/main/start.sh
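For context, a script like that roughly has three jobs: start the local Ollama server, put a LiteLLM proxy in front of it, and point `mods` at the proxy. A minimal sketch is below; the model name, port, and config values are assumptions for illustration, not taken from the linked `start.sh`:

```shell
#!/bin/sh
# Hedged sketch: wire up Ollama -> LiteLLM proxy -> mods.
# Model name (ollama/llama3), proxy port (4000), and config values are assumed.
set -eu

# 1. Start the Ollama server in the background if the binary is available.
command -v ollama >/dev/null 2>&1 && ollama serve &

# 2. Run a LiteLLM proxy in front of the local Ollama model.
command -v litellm >/dev/null 2>&1 && litellm --model ollama/llama3 --port 4000 &

# 3. Write a mods config that talks to the proxy's OpenAI-compatible endpoint.
mkdir -p "${HOME}/.config/mods"
cat > "${HOME}/.config/mods/mods.yml" <<'EOF'
default-model: llama3
apis:
  openai:
    base-url: http://127.0.0.1:4000
    models:
      llama3:
        aliases: ["l3"]
EOF
```

The guards on steps 1 and 2 let the script degrade gracefully when a tool is missing; the real script in the repo may differ in all of these details.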
I noticed that the Windows tests are currently failing; I'll work on fixing them tomorrow.
> Hi @yeahdongcn thank you for contributing this over. Is the Moore team going to keep this integration always up-to-date?
>
> How should we be testing against this on...
Hi @mchiang0610 I’ve discussed this with my manager, and instead of sending the GPU hardware directly, we believe it would be more efficient to set up an in-house environment to...
Since https://github.com/ollama/ollama/pull/8539 introduces significant changes to the build process, I'll need some time to update this PR accordingly.
> Since #8539 introduces significant changes to the build process, I'll need some time to update this PR accordingly.

Done. Please check the latest commits for details.
@jmorganca @mchiang0610 @mxyng @dhiltgen Could you please review this PR? The new commits after the rebase look cleaner.