nos
⚡️ A fast and flexible PyTorch inference server that runs locally, on any cloud, or on AI hardware.
- https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1
- https://huggingface.co/mistralai/Mixtral-8x7B-v0.1
The default minimums on CPU count and container memory are not satisfiable on an M2 MacBook Air, though most models will run with more relaxed limits. Let's just add this as a dedicated...
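As a sketch of the relaxed-limits idea, a local run on an M2 MacBook Air could lower the resource requests to something the machine can satisfy. The image name and values below are illustrative, not the project's actual defaults; only the Docker flags themselves are standard.

```shell
# Hypothetical relaxed limits for a local M2 MacBook Air run.
# --cpus: fewer cores than a server-class default minimum.
# --memory: small enough to fit within an 8 GB Air.
docker run --rm --cpus=2 --memory=4g autonomi/nos:latest
```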
https://github.com/autonomi-ai/nos/blob/main/docs/concepts/model-manager.md
https://github.com/autonomi-ai/nos/blob/main/docs/concepts/model-spec.md
Avoid requiring the client to generate the pb2 files; instead, generate the pb files on `make dist` and add them to the wheel file.
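A minimal sketch of that build step, assuming a `proto/` source directory and a `nos/protoc` output package (both paths are illustrative, not the actual repo layout): generate the stubs once at dist time so the wheel ships them pre-built.

```shell
# Hypothetical `make dist` recipe (sketch): generate *_pb2.py stubs
# into the package, then build the wheel so clients never run protoc.
python -m grpc_tools.protoc -Iproto \
    --python_out=nos/protoc \
    --grpc_python_out=nos/protoc \
    proto/service.proto
python -m build  # wheel now includes the generated pb2 files
```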
Ability to register a module from a class definition (expose `ModuleFromCls` in `InferenceModule`).
https://github.com/autonomi-ai/nos/blob/main/docs/concepts/runtime-environments.md