
mixtral branch: dimension mismatch in `cheap_embed`

Open Spico197 opened this issue 1 year ago • 0 comments

Here, the tensor in `cheap_embed` is 4-dimensional: https://github.com/cg123/mergekit/blob/d55f654c2e70d3ac4ad6532de96e266aff2de931/mergekit/scripts/mixtral_moe.py#L87

However, `gate_vec` receives a 3-dimensional tensor: https://github.com/cg123/mergekit/blob/d55f654c2e70d3ac4ad6532de96e266aff2de931/mergekit/scripts/mixtral_moe.py#L158-L161
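
To make the mismatch concrete, here is a minimal sketch of a rank check in plain Python. Everything in it is an assumption for illustration: the helper names (`nested_shape`, `zeros`, `check_ndim`) and the shapes are hypothetical and are not taken from `mixtral_moe.py`.

```python
def nested_shape(x):
    """Return the shape of a regularly nested list, e.g. [[1, 2], [3, 4]] -> (2, 2)."""
    shape = []
    while isinstance(x, list):
        shape.append(len(x))
        if not x:
            break
        x = x[0]
    return tuple(shape)


def zeros(*dims):
    """Build a nested list of zeros with the given dimensions."""
    if len(dims) == 1:
        return [0.0] * dims[0]
    return [zeros(*dims[1:]) for _ in range(dims[0])]


def check_ndim(value, expected_ndim, name):
    """Raise ValueError unless `value` has exactly `expected_ndim` dimensions."""
    shape = nested_shape(value)
    if len(shape) != expected_ndim:
        raise ValueError(
            f"{name}: expected {expected_ndim} dims, got {len(shape)} with shape {shape}"
        )
    return shape


# Hypothetical shapes, chosen only to mirror the 4-D vs 3-D mismatch described above.
embed_out = zeros(2, 1, 3, 4)  # stands in for a 4-D embedding result
gate_in = zeros(2, 3, 4)       # stands in for the 3-D input the consumer gets

check_ndim(gate_in, 3, "gate_in")  # passes

try:
    check_ndim(embed_out, 3, "embed_out")  # fails: one dimension too many
except ValueError as err:
    message = str(err)
    print(message)
```

A check like this at the boundary between the two functions would surface the extra dimension immediately instead of failing later in a less obvious place.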

Spico197, Jan 15 '24 10:01