mergekit

Tools for merging pretrained large language models.

Results 231 mergekit issues

Hi! Please consider implementing an Evolutionary Merging method.

This is a new sparsification method that I have been thinking about. The trimming and dropping methods resemble the Top-P and Typical-P methods used in sampling LLMs. However, by far...
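For context, the two existing styles of sparsification mentioned above (trimming and dropping) can be sketched roughly as follows. This is a minimal, standard-library-only illustration on flat lists of task-vector values, not the method proposed in the issue and not mergekit's code; the function names `trim_by_magnitude` and `random_drop` and the `density` parameter are hypothetical.

```python
import random


def trim_by_magnitude(delta, density):
    """Keep the largest-magnitude fraction `density` of entries; zero the rest."""
    k = int(round(density * len(delta)))
    if k <= 0:
        return [0.0] * len(delta)
    # Indices of the k entries with the largest absolute value.
    order = sorted(range(len(delta)), key=lambda i: abs(delta[i]), reverse=True)
    keep = set(order[:k])
    return [d if i in keep else 0.0 for i, d in enumerate(delta)]


def random_drop(delta, density, rng):
    """Keep each entry with probability `density` and rescale survivors by 1/density.

    Assumes 0 < density <= 1. The rescaling keeps the expected value of each
    entry unchanged.
    """
    return [d / density if rng.random() < density else 0.0 for d in delta]


if __name__ == "__main__":
    delta = [0.1, -2.0, 0.5, 3.0]
    print(trim_by_magnitude(delta, 0.5))
    print(random_drop(delta, 0.5, random.Random(0)))
```

Trimming is deterministic and favors large updates, while dropping is random and unbiased in expectation, which is roughly why the post compares them to different sampling strategies.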

I see people are trying to extract the Mistral-22b ancestor from the MoE model by averaging the MLP layers and wondered if the 'model stock' method in Mergekit could be...
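The averaging idea mentioned here amounts to an element-wise mean over the experts' MLP weights. A minimal sketch, assuming each expert's weights have been flattened into equal-length lists of floats; `average_experts` is a hypothetical helper, not a mergekit function, and this is plain averaging rather than the 'model stock' method.

```python
def average_experts(expert_weights):
    """Element-wise mean of same-length flat weight lists, one list per expert."""
    if not expert_weights:
        raise ValueError("need at least one expert")
    length = len(expert_weights[0])
    if any(len(w) != length for w in expert_weights):
        raise ValueError("all experts must have the same number of weights")
    n = len(expert_weights)
    # zip(*...) groups the i-th weight of every expert together.
    return [sum(column) / n for column in zip(*expert_weights)]


if __name__ == "__main__":
    print(average_experts([[1.0, 2.0], [3.0, 6.0]]))
```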

Hi guys! I'm creating MoE models with this [config](https://drive.google.com/file/d/1JwJZCQWZyNRCqgpuxQ-u7qELLxMISPko/view?usp=sharing) and this notebook in [Google Colab](https://colab.research.google.com/drive/1JfFNBKNkfYEIUvEksEPRN2e7u6WTKS1B). Halfway through the merge, I get an error: Warm up loaders: 0% 0/5 [00:00

https://github.com/Leeroo-AI/mergoo

Honestly, I haven't gone through all the issues to see if this has been discussed before, but I'm wondering if the core implementation of DARE TIES in this library...

Hi! I used the SLERP algorithm but noticed that it first converts all torch tensors to NumPy arrays, runs the algorithm, and then converts everything back. Since people would presumably...
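For readers unfamiliar with the algorithm being discussed, here is a minimal sketch of spherical linear interpolation on flat weight vectors. It uses only the Python standard library so it runs without torch or NumPy; it is an illustration of the general formula, not mergekit's implementation, and the function name `slerp` and the `eps` threshold are assumptions.

```python
import math


def slerp(t, v0, v1, eps=1e-8):
    """Spherically interpolate between vectors v0 (t=0) and v1 (t=1)."""
    lerp = [(1.0 - t) * a + t * b for a, b in zip(v0, v1)]
    norm0 = math.sqrt(sum(a * a for a in v0))
    norm1 = math.sqrt(sum(b * b for b in v1))
    # A (near-)zero vector has no direction, so fall back to linear interpolation.
    if norm0 < eps or norm1 < eps:
        return lerp
    # Cosine of the angle between the two vectors, clamped for acos.
    cos_theta = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1)
    cos_theta = max(-1.0, min(1.0, cos_theta))
    theta = math.acos(cos_theta)
    sin_theta = math.sin(theta)
    # Nearly parallel or antiparallel vectors: the formula is ill-conditioned.
    if abs(sin_theta) < eps:
        return lerp
    w0 = math.sin((1.0 - t) * theta) / sin_theta
    w1 = math.sin(t * theta) / sin_theta
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]


if __name__ == "__main__":
    print(slerp(0.5, [1.0, 0.0], [0.0, 1.0]))
```

The same arithmetic maps one-to-one onto tensor operations, which is the point the issue raises about avoiding the round trip between libraries.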

It would be very interesting to see the implementation of this type of hybrid model in the future. Link: https://huggingface.co/ai21labs/Jamba-v0.1

Hi all! I recently got interested in model merging and I believe it's a really interesting and promising field of Deep Learning. I really like the vibe of your library...