Added support for the DeepSeekV2 model
This pull request introduces the capability to merge DeepSeekV2 Mixture-of-Experts (MoE) models with MergeKit. To support this, a `deepseekv2.json` configuration file has been added to the architecture directory. Additionally, a custom class analogous to the existing Mixtral implementation has been added to enable merging based on that JSON configuration.
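For context, MergeKit's architecture JSON files describe a model's tensor layout so the merge logic can iterate over weights generically. The sketch below is illustrative only, assuming field names modeled on MergeKit's existing architecture definitions (such as the Mixtral one); the hypothetical tensor names shown here do not reflect the actual DeepSeekV2 checkpoint layout, which the real `deepseekv2.json` in this PR would define.

```json
{
  "model_type": "deepseek_v2",
  "architectures": ["DeepseekV2ForCausalLM"],
  "pre_weights": [
    {"name": "model.embed_tokens.weight", "is_embed": true}
  ],
  "post_weights": [
    {"name": "model.norm.weight"},
    {"name": "lm_head.weight", "is_embed": true}
  ],
  "num_layers_config_key": "num_hidden_layers",
  "layer_templates": {
    "weights": [
      {"name": "model.layers.${layer_index}.input_layernorm.weight"},
      {"name": "model.layers.${layer_index}.post_attention_layernorm.weight"}
    ]
  }
}
```

In this scheme, `layer_templates.weights` would be expanded once per decoder layer, with `${layer_index}` substituted in; MoE expert weights would need additional per-expert entries, which is where a DeepSeekV2-specific definition diverges from dense architectures.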
@aditya-29 can you please respond? :) Sorry for the late reply.
Thanks @metric-space and @shamanez. I didn't get a chance to go over the comments earlier. I will work on the suggested changes and reach out to you if I need any clarification.
Thanks a lot, mate.
@cg123 is there a plan to support DeepSeekV2 and DeepSeekV3?