nos
⚡️ A fast and flexible PyTorch inference server that runs locally, on any cloud, or on AI hardware.
- https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1
- https://huggingface.co/mistralai/Mixtral-8x7B-v0.1
The default minimums on CPU count and container memory are not satisfiable on an M2 MacBook Air, though most models will run with more relaxed limits. Let's just add this as a dedicated...
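As a sketch of the relaxed-limits idea, a local run on an M2 MacBook Air could lower the resource requests to something the machine can satisfy. The image name and values below are illustrative, not the project's actual defaults; only the Docker flags themselves are standard.

```shell
# Hypothetical relaxed limits for a local M2 MacBook Air run.
# --cpus: fewer cores than a server-class default minimum.
# --memory: small enough to fit within an 8 GB Air.
docker run --rm --cpus=2 --memory=4g autonomi/nos:latest
```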
https://github.com/autonomi-ai/nos/blob/main/docs/concepts/model-manager.md
https://github.com/autonomi-ai/nos/blob/main/docs/concepts/model-spec.md
Avoid requiring the client to generate the pb2 files; instead, generate the pb files on `make dist` and add them to the wheel file.
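A minimal sketch of that build step, assuming a `proto/` source directory and a `nos/protoc` output package (both paths are illustrative, not the actual repo layout): generate the stubs once at dist time so the wheel ships them pre-built.

```shell
# Hypothetical `make dist` recipe (sketch): generate *_pb2.py stubs
# into the package, then build the wheel so clients never run protoc.
python -m grpc_tools.protoc -Iproto \
    --python_out=nos/protoc \
    --grpc_python_out=nos/protoc \
    proto/service.proto
python -m build  # wheel now includes the generated pb2 files
```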
Ability to register a module from a class definition (expose `ModuleFromCls` in `InferenceModule`).
https://github.com/autonomi-ai/nos/blob/main/docs/concepts/runtime-environments.md