
Tools for merging pretrained large language models.

Results: 231 mergekit issues, sorted by recently updated

An example of this in the wild is Gemma2, where saving a Gemma2 model with `save_pretrained` ignores the `lm_head` tensor, due to it being a tied weight, whereas a Gemma2...
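For context, a minimal sketch of the tied-weight behavior described above, assuming a standard transformers setup; the model id and output directory are illustrative, not taken from the issue:

```python
from transformers import AutoModelForCausalLM

# Illustrative model id; any model with tied embeddings shows the same behavior.
model = AutoModelForCausalLM.from_pretrained("google/gemma-2-2b")

print(model.config.tie_word_embeddings)                          # True for Gemma2
print(model.lm_head.weight is model.model.embed_tokens.weight)   # same tensor object

# Because the two names point at one shared tensor, save_pretrained de-duplicates
# it, and the written checkpoint has no separate lm_head.weight entry for a merge
# tool to read back.
model.save_pretrained("gemma2-saved")
```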

Set up communication channels (e.g., Discord, Slack) to enhance collaboration and support. Update README.md with links and announce on GitHub and Hugging Face Spaces.

If I want to design a new merging method with mergekit, what steps do I need to follow? Which files do I need to modify?
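For orientation, here is a hedged sketch of the arithmetic a simple linear-style merge method implements, written in plain PyTorch rather than against mergekit's internal merge-method interface (which is not shown in this excerpt):

```python
import torch
from typing import Dict, List

def linear_merge(state_dicts: List[Dict[str, torch.Tensor]],
                 weights: List[float]) -> Dict[str, torch.Tensor]:
    """Weighted average of matching tensors across several model state dicts."""
    total = sum(weights)
    merged = {}
    for name, ref in state_dicts[0].items():
        if torch.is_floating_point(ref):
            merged[name] = sum(w * sd[name] for w, sd in zip(weights, state_dicts)) / total
        else:
            merged[name] = ref  # integer buffers (e.g. position ids) are copied, not averaged
    return merged
```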

```yaml
models:
  - model: stabilityai/stable-diffusion-xl-base-1.0
    parameters:
      weight: 1.0
    lora:
      - path: ehristoforu/dalle-3-xl-v2
  - model: stabilityai/stable-diffusion-xl-base-1.0
    parameters:
      weight: 0.5
    lora:
      - path: artificialguybr/3DRedmond-V1
merge_method: linear
dtype: float16
```

what can I do for...

Thank you for your outstanding contribution! I am not very familiar with configuring model merges, and I couldn't find explanations for each parameter in the repository. I encountered...

This pull request introduces the capability to merge DeepSeekV2 Mixture-of-Experts (MoE) models using MergeKit. To facilitate this, a `deepseekv2.json` configuration file has been added to the architecture directory. Additionally, a...

Hi, I'm trying to merge multiple BERT models for text classification. When doing a simple linear merge, I get the following error message: `RuntimeError: Unsupported architecture BertForSequenceClassification`. Having a closer...
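As a rough workaround sketch outside mergekit, assuming both checkpoints share the same architecture and label count (model names below are placeholders), a plain equal-weight linear merge can be done directly with transformers and PyTorch:

```python
import torch
from transformers import AutoModelForSequenceClassification

# Placeholder checkpoints; both must have identical architectures and label counts.
model_a = AutoModelForSequenceClassification.from_pretrained("path/or/repo-for-model-a")
model_b = AutoModelForSequenceClassification.from_pretrained("path/or/repo-for-model-b")

state_b = model_b.state_dict()
merged = {}
for name, tensor_a in model_a.state_dict().items():
    if torch.is_floating_point(tensor_a):
        merged[name] = 0.5 * tensor_a + 0.5 * state_b[name]  # equal-weight linear merge
    else:
        merged[name] = tensor_a  # keep integer buffers unchanged

model_a.load_state_dict(merged)
model_a.save_pretrained("merged-bert")
```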

![1721889849832](https://github.com/user-attachments/assets/7d9b57ed-74cf-4edd-b6aa-61adf2f1bece) Does mergekit-moe support Qwen-series models? I want to combine two Qwen models that were fine-tuned with LoRA (and merged back into their base) together with an un-finetuned Qwen model into an MoE model, but I tried Qwen, Qwen1.5, and Qwen2, and all three gave the result shown in the image. Is Qwen actually supported, and if so, which version?

Here is a piece of code in the file mergekit/mergekit/moe/qwen.py: `for model_ref in ( [config.base_model] + [e.source_model for e in config.experts] + [e.source_model for e in (config.shared_experts or [])]...`

Hello, is it possible to merge a local model? I mean a non-Hugging Face model. What should the parameters be in such cases? What I want is to perform some preprocessing...
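As a hedged sketch: model references in a mergekit config can generally point at a local directory holding an HF-format checkpoint rather than a Hub id; the paths and weights below are placeholders, not taken from the question.

```yaml
# Placeholder local paths to HF-format checkpoint directories.
models:
  - model: /path/to/local-model-a
    parameters:
      weight: 0.5
  - model: /path/to/local-model-b
    parameters:
      weight: 0.5
merge_method: linear
dtype: float16
```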