[Model] Add support for GraniteMoeShared models
Adds support for the granitemoeshared model type, which is based on granitemoe but with the addition of a shared experts layer. A preview model with this architecture can be found at ibm-research/moe-7b-1b-active-shared-experts.
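
To illustrate what a shared experts layer adds to a routed mixture-of-experts block, here is a minimal pure-Python sketch. It is not the vLLM or transformers implementation: the function names, the softmax-over-top-k routing, and the choice to sum the shared expert output with the routed output are all assumptions made for illustration.

```python
import math
from typing import Callable, List, Sequence

Vector = List[float]
Expert = Callable[[Sequence[float]], Vector]


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a short list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def moe_with_shared_expert(
    x: Sequence[float],
    router_logits: Sequence[float],
    routed_experts: Sequence[Expert],
    shared_expert: Expert,
    top_k: int,
) -> Vector:
    """Toy MoE forward pass for a single token (hypothetical helper).

    Only the top_k routed experts run, weighted by a softmax over their
    logits. The shared expert runs for every token, and its output is
    added to the routed result (an assumption of this sketch).
    """
    # Indices of the top_k highest-scoring routed experts.
    top = sorted(
        range(len(router_logits)),
        key=lambda i: router_logits[i],
        reverse=True,
    )[:top_k]
    weights = softmax([router_logits[i] for i in top])

    # Weighted sum of the selected routed experts' outputs.
    routed = [0.0] * len(x)
    for weight, index in zip(weights, top):
        expert_out = routed_experts[index](x)
        routed = [r + weight * v for r, v in zip(routed, expert_out)]

    # The shared expert is always applied, independent of routing.
    shared = shared_expert(x)
    return [r + s for r, s in zip(routed, shared)]


def make_scale_expert(factor: float) -> Expert:
    """Stand-in expert that multiplies its input by a constant."""
    return lambda x: [factor * v for v in x]
```

In this sketch, routing decides which routed experts contribute, while the shared expert contributes to every token regardless of the router's decision.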
transformers support for the GraniteMoeShared model was recently merged and requires transformers >= v4.49.0.
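
A version requirement like this can be checked before attempting to load the model. The sketch below uses only the standard library; `parse_version` and `supports_granitemoeshared` are hypothetical helper names, not part of vLLM or transformers. It compares only the numeric release components, so a pre-release string such as 4.49.0.dev0 is treated as satisfying the requirement, which is a simplification.

```python
import re
from typing import Tuple

# Minimum transformers release stated in the description above.
MIN_TRANSFORMERS: Tuple[int, ...] = (4, 49, 0)


def parse_version(version: str) -> Tuple[int, ...]:
    """Return the leading numeric components of a version string.

    Suffixes such as '.dev0' or 'rc1' are ignored (a simplification).
    """
    match = re.match(r"v?(\d+(?:\.\d+)*)", version.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def supports_granitemoeshared(transformers_version: str) -> bool:
    """Check a transformers version string against the stated minimum."""
    parts = parse_version(transformers_version)
    # Pad short versions such as '4.49' to three components.
    padded = parts + (0,) * (3 - len(parts))
    return padded >= MIN_TRANSFORMERS
```

With transformers installed, the helper could be called as `supports_granitemoeshared(transformers.__version__)`.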