
Can this be used for Jamba inference?

Open · freQuensy23-coder opened this issue 10 months ago • 1 comment

Can I use this solution for inference of Jamba (https://huggingface.co/ai21labs/Jamba-v0.1/discussions), offloading its Mamba/MoE layers?

Jamba is a SOTA open-source long-context model, and supporting it would be very useful for this library.
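For comparison, generic layer-level offloading already works today through transformers + accelerate. A minimal sketch, assuming a transformers version with Jamba support and accelerate installed (this is plain per-layer offloading, not this repo's expert-level strategy):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ai21labs/Jamba-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)

# device_map="auto" lets accelerate place layers on GPU and spill the
# rest to CPU (and to disk via offload_folder) when memory runs out.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    offload_folder="offload",
)

inputs = tokenizer("Jamba is a hybrid SSM-Transformer model", return_tensors="pt")
outputs = model.generate(**inputs.to(model.device), max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```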

freQuensy23-coder avatar Apr 08 '24 16:04 freQuensy23-coder

Hey, @freQuensy23-coder! The code in this repo is quite transformer-MoE-specific. I'm not too familiar with Mamba-like architectures, but AFAIK @lavawolfiee has plans for adapting Jamba to work with our offloading strategy.
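To give a rough idea of what that adaptation would involve, here is a minimal, hypothetical sketch of the expert-offloading idea applied to transformer MoE layers: keep a small LRU cache of experts resident on GPU and pull the rest in from CPU on demand. The class and method names are illustrative, not this library's actual API:

```python
from collections import OrderedDict
import torch.nn as nn

class ExpertCache:
    """LRU cache keeping at most `capacity` MoE experts resident on GPU."""

    def __init__(self, experts: list[nn.Module], capacity: int, device="cuda"):
        self.experts = experts      # master copies start on CPU
        self.capacity = capacity
        self.device = device
        self.resident: OrderedDict[int, nn.Module] = OrderedDict()

    def get(self, idx: int) -> nn.Module:
        if idx in self.resident:
            self.resident.move_to_end(idx)  # mark as recently used
            return self.resident[idx]
        if len(self.resident) >= self.capacity:
            # Evict the least-recently-used expert back to CPU.
            _, evicted = self.resident.popitem(last=False)
            evicted.to("cpu")
        expert = self.experts[idx].to(self.device)
        self.resident[idx] = expert
        return expert

# Inside a MoE layer's forward, after the router picks top-k experts:
#   y = sum(w * cache.get(i)(x) for i, w in zip(top_idx, top_weights))
```

This only covers the MoE blocks; a Jamba port would additionally need to decide how to handle the Mamba mixer weights, which don't decompose into discrete experts that can be cached this way.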

dvmazur avatar Apr 11 '24 10:04 dvmazur