Added support for DeepseekV2 model

Open aditya-29 opened this issue 1 year ago • 4 comments

This pull request introduces the capability to merge DeepSeekV2 Mixture-of-Experts (MoE) models using MergeKit. To facilitate this, a deepseekv2.json configuration file has been added to the architecture directory. Additionally, a custom class analogous to Mixtral has been implemented to enable model merging based on the JSON configuration.
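As a rough illustration of why a per-architecture definition is needed for MoE models, the sketch below enumerates the per-expert tensor names a merge would have to cover. It is not the code from this pull request and not MergeKit's API: the function `expert_weight_names`, its parameters, and the tensor name pattern are all assumptions, modeled on Hugging Face-style checkpoint naming.

```python
# Hypothetical sketch: not MergeKit's API and not the code in this PR.
from typing import List

# Assumed projection names inside each expert MLP.
EXPERT_PROJECTIONS = ("gate_proj", "up_proj", "down_proj")


def expert_weight_names(
    num_layers: int, num_experts: int, first_moe_layer: int = 1
) -> List[str]:
    """List router and per-expert tensor names for every MoE layer.

    The name pattern is an assumption; check it against the real checkpoint.
    Layers below `first_moe_layer` are treated as dense and skipped.
    """
    names: List[str] = []
    for layer in range(first_moe_layer, num_layers):
        prefix = f"model.layers.{layer}.mlp"
        names.append(f"{prefix}.gate.weight")  # router weights
        for expert in range(num_experts):
            for proj in EXPERT_PROJECTIONS:
                names.append(f"{prefix}.experts.{expert}.{proj}.weight")
    return names


if __name__ == "__main__":
    for name in expert_weight_names(num_layers=3, num_experts=2):
        print(name)
```

A JSON architecture file could describe the same pattern declaratively, so that the expert count and layer count come from the model config rather than being hard-coded.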

aditya-29 avatar Jul 26 '24 06:07 aditya-29

@aditya-29 can you please respond :) sorry for the late reply.

shamanez avatar Aug 28 '24 18:08 shamanez

Thanks @metric-space and @shamanez. I didn't get a chance to go over the comments earlier. I will work on the suggested changes and reach out to you if I need any clarification.

aditya-29 avatar Aug 28 '24 19:08 aditya-29

Thanks a lot mate.

shamanez avatar Aug 28 '24 21:08 shamanez

@cg123 is there a plan to support DeepseekV2 and DeepseekV3?

ehartford avatar Jan 24 '25 18:01 ehartford