Add the Phi 3.5 MoE model

Open EricLBuehler opened this issue 1 year ago • 0 comments

The Phi 3.5 MoE model is a ~42B-parameter model with 16 experts, of which 2 are active per token.
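
To illustrate what "16 experts, 2 active" means, here is a minimal sketch of generic top-k softmax routing in plain Rust. It is not the routing code from this PR, and the model's actual routing may differ; `top_k_route` and its signature are hypothetical names used only for illustration.

```rust
/// Hypothetical helper (not part of candle): pick the `k` highest-scoring
/// experts for one token and return (expert index, weight) pairs,
/// highest score first. Weights are a softmax over the selected logits only.
fn top_k_route(router_logits: &[f32], k: usize) -> Vec<(usize, f32)> {
    // Sort expert indices by descending router logit and keep the first k.
    let mut order: Vec<usize> = (0..router_logits.len()).collect();
    order.sort_by(|&a, &b| router_logits[b].total_cmp(&router_logits[a]));
    order.truncate(k);
    if order.is_empty() {
        return Vec::new();
    }

    // Subtract the largest selected logit before exponentiating for stability.
    let max = router_logits[order[0]];
    let exps: Vec<f32> = order
        .iter()
        .map(|&i| (router_logits[i] - max).exp())
        .collect();
    let sum: f32 = exps.iter().sum();

    // Normalize so the selected experts' weights sum to 1.
    order
        .into_iter()
        .zip(exps)
        .map(|(i, e)| (i, e / sum))
        .collect()
}
```

With 16 router logits and `k = 2`, only the two returned experts would run for that token, and their outputs would be combined using the returned weights.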

This PR implements the model and provides a simple inference example.

Additionally, this PR adds a layers module to candle_transformers. Perhaps we can use it to hold reusable layers, such as the Phi 3 or Llama RoPE. The Phi 3 RoPE implementation has already been added there. It is currently used only by the MoE model, but could we also use it in the regular Phi 3 model?
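
As a rough picture of what a shared RoPE layer computes, here is a sketch of standard rotary position embeddings in plain Rust. It is an assumption-laden illustration, not the implementation added in this PR: `rope_inv_freqs` and `apply_rope` are hypothetical names, the "rotate half" pairing is assumed, and any Phi 3-specific scaling is omitted.

```rust
/// Hypothetical helper: inverse frequencies for standard RoPE,
/// theta^(-2i/d) for i in 0..d/2.
fn rope_inv_freqs(head_dim: usize, theta: f32) -> Vec<f32> {
    (0..head_dim / 2)
        .map(|i| 1.0 / theta.powf(2.0 * i as f32 / head_dim as f32))
        .collect()
}

/// Hypothetical helper: rotate one head vector in place for token
/// position `pos`. Element i is paired with element i + d/2.
fn apply_rope(x: &mut [f32], pos: usize, inv_freqs: &[f32]) {
    let half = x.len() / 2;
    assert!(inv_freqs.len() >= half, "need one frequency per pair");
    for i in 0..half {
        let angle = pos as f32 * inv_freqs[i];
        let (sin, cos) = angle.sin_cos();
        let (a, b) = (x[i], x[i + half]);
        // 2D rotation of the pair (a, b) by `angle`.
        x[i] = a * cos - b * sin;
        x[i + half] = a * sin + b * cos;
    }
}
```

Because nothing here depends on the model, the same helper could in principle be called from both the MoE model and the regular Phi 3 model, which is the kind of reuse the layers module is meant to enable.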

EricLBuehler, Sep 03 '24 02:09