
Tools for merging pretrained large language models.

Results: 231 mergekit issues, sorted by recently updated

An example of this in the wild is Gemma2, where saving a Gemma2 model with `save_pretrained` ignores the `lm_head` tensor, due to it being a tied weight, whereas a Gemma2...
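For context, a minimal sketch of the tied-weight behavior described above, assuming a standard transformers setup; the model id and output directory are illustrative, not taken from the issue:

```python
from transformers import AutoModelForCausalLM

# Illustrative model id; any model with tied embeddings shows the same behavior.
model = AutoModelForCausalLM.from_pretrained("google/gemma-2-2b")

print(model.config.tie_word_embeddings)                          # True for Gemma2
print(model.lm_head.weight is model.model.embed_tokens.weight)   # same tensor object

# Because the two names point at one shared tensor, save_pretrained de-duplicates
# it, and the written checkpoint has no separate lm_head.weight entry for a merge
# tool to read back.
model.save_pretrained("gemma2-saved")
```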

Set up communication channels (e.g., Discord, Slack) to enhance collaboration and support. Update README.md with links and announce on GitHub and Hugging Face Spaces.

If I want to design a new merging method with mergekit, what steps do I need to follow? Which files do I need to modify?
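For orientation, here is a hedged sketch of the arithmetic a simple linear-style merge method implements, written in plain PyTorch rather than against mergekit's internal merge-method interface (which is not shown in this excerpt):

```python
import torch
from typing import Dict, List

def linear_merge(state_dicts: List[Dict[str, torch.Tensor]],
                 weights: List[float]) -> Dict[str, torch.Tensor]:
    """Weighted average of matching tensors across several model state dicts."""
    total = sum(weights)
    merged = {}
    for name, ref in state_dicts[0].items():
        if torch.is_floating_point(ref):
            merged[name] = sum(w * sd[name] for w, sd in zip(weights, state_dicts)) / total
        else:
            merged[name] = ref  # integer buffers (e.g. position ids) are copied, not averaged
    return merged
```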

```yaml
models:
  - model: stabilityai/stable-diffusion-xl-base-1.0
    parameters:
      weight: 1.0
    lora:
      - path: ehristoforu/dalle-3-xl-v2
  - model: stabilityai/stable-diffusion-xl-base-1.0
    parameters:
      weight: 0.5
    lora:
      - path: artificialguybr/3DRedmond-V1
merge_method: linear
dtype: float16
```

what can I do for...

Thank you for your outstanding contribution! I am not very familiar with configuring model merges, and I couldn't find explanations for each parameter in the repository. I encountered...

This pull request introduces the capability to merge DeepSeekV2 Mixture-of-Experts (MoE) models using MergeKit. To facilitate this, a `deepseekv2.json` configuration file has been added to the architecture directory. Additionally, a...

Hi, I'm trying to merge multiple BERT models for text classification. When doing a simple linear merge, I get the following error message: `RuntimeError: Unsupported architecture BertForSequenceClassification`. Having a closer...
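As a rough workaround sketch outside mergekit, assuming both checkpoints share the same architecture and label count (model names below are placeholders), a plain equal-weight linear merge can be done directly with transformers and PyTorch:

```python
import torch
from transformers import AutoModelForSequenceClassification

# Placeholder checkpoints; both must have identical architectures and label counts.
model_a = AutoModelForSequenceClassification.from_pretrained("path/or/repo-for-model-a")
model_b = AutoModelForSequenceClassification.from_pretrained("path/or/repo-for-model-b")

state_b = model_b.state_dict()
merged = {}
for name, tensor_a in model_a.state_dict().items():
    if torch.is_floating_point(tensor_a):
        merged[name] = 0.5 * tensor_a + 0.5 * state_b[name]  # equal-weight linear merge
    else:
        merged[name] = tensor_a  # keep integer buffers unchanged

model_a.load_state_dict(merged)
model_a.save_pretrained("merged-bert")
```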

![1721889849832](https://github.com/user-attachments/assets/7d9b57ed-74cf-4edd-b6aa-61adf2f1bece) Does mergekit-moe support Qwen-series models? I want to combine two Qwen models that were fine-tuned with LoRA (and merged back into their base) together with an un-finetuned Qwen model into an MoE model, but I tried Qwen, Qwen1.5, and Qwen2, and all three gave the result shown in the image. Is Qwen actually supported, and if so, which version?

Here is a piece of code in the file mergekit/mergekit/moe/qwen.py: `for model_ref in ( [config.base_model] + [e.source_model for e in config.experts] + [e.source_model for e in (config.shared_experts or [])]...`

Hello, is it possible to merge a local model? I mean a non-Hugging Face model. What should the parameters be in such cases? What I want is to perform some preprocessing...
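As a hedged sketch: model references in a mergekit config can generally point at a local directory holding an HF-format checkpoint rather than a Hub id; the paths and weights below are placeholders, not taken from the question.

```yaml
# Placeholder local paths to HF-format checkpoint directories.
models:
  - model: /path/to/local-model-a
    parameters:
      weight: 0.5
  - model: /path/to/local-model-b
    parameters:
      weight: 0.5
merge_method: linear
dtype: float16
```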