mergekit

Tools for merging pretrained large language models.

Results 231 mergekit issues

Hi! Thanks for your great work! I have two questions. (1) When I use the following setting

```
models:
  - model: /data2/model/Quantize/llama2-chat_normal
    parameters:
      weight: 0.1
  - model: /data2/model/Quantize/llama2-chat_normal
    parameters:
      weight:...
```

## Problem In a Mixture of Experts (MoE) LLM, the gating network outputs a categorical distribution of $n$ values (chosen from $n_{max}$), which is then used to create a convex...

Has anyone tried downscaling the K and/or Q matrices for repeated layers in franken-merges? This should act like changing the temperature of the softmax and effectively smooth the distribution: **Hopfield...
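The temperature analogy in the question above can be checked numerically. The following is a minimal sketch, not mergekit code: it assumes plain dot-product attention over a single query, omits the usual `1/sqrt(d)` factor, and uses made-up vectors and a hypothetical `q_scale` parameter. It shows that multiplying the query by `alpha` gives the same attention weights as a softmax with temperature `1/alpha`, and that `alpha < 1` flattens the distribution.

```python
import math


def softmax(scores, temperature=1.0):
    """Numerically stable softmax with an explicit temperature."""
    scaled = [s / temperature for s in scores]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def attention_weights(q, keys, q_scale=1.0):
    """Attention weights for one query; q_scale is a hypothetical downscaling factor."""
    scaled_q = [q_scale * x for x in q]
    return softmax([dot(scaled_q, k) for k in keys])


def entropy(p):
    """Shannon entropy in nats; higher means a smoother distribution."""
    return -sum(x * math.log(x) for x in p if x > 0)


# Illustrative values only (not taken from any model).
q = [1.0, 2.0, -1.0]
keys = [[0.5, 1.0, 0.0], [1.0, -1.0, 2.0], [0.0, 2.0, 1.0]]
alpha = 0.5

baseline = attention_weights(q, keys)
scaled = attention_weights(q, keys, q_scale=alpha)
tempered = softmax([dot(q, k) for k in keys], temperature=1.0 / alpha)

print("baseline:", baseline)
print("scaled Q:", scaled)
print("tempered:", tempered)
print("entropy baseline vs scaled:", entropy(baseline), entropy(scaled))
```

Because the score is bilinear in Q and K, scaling K by `alpha` instead of Q has the same effect, and scaling both multiplies the logits by `alpha ** 2`.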

If one trains at a context window of 8K, can one merge the result with a model of the same architecture that has a longer context window? Say, a model trained from https://huggingface.co/meta-llama/Meta-Llama-3-8B merged into https://huggingface.co/NurtureAI/Meta-Llama-3-8B-Instruct-64k

I'm trying to merge some embedding models with this config file. The architectures are similar, but I think it is erroring out on some of the layer names? Would love some...

```bash
python dump_out.py gpt2 -o dump_output --dump-type hidden-state -d metric-space/experiment_med -s 2 -c question -u part1
```

```bash
python dump_out.py gpt2 -o dump_output --dump-type activation -d metric-space/experiment_med -s 2 -c...
```

Thanks for this amazing work. It makes merging models much easier. I read this paper recently and the proposed method, Variation Ratio Merge (VARM), is also a novel merge...

In the .ipynb notebook you provided (linked below), where can I specify which GPU the merging process should run on? https://github.com/arcee-ai/mergekit/blob/main/notebook.ipynb

Happy to see that you have implemented evolutionary merging. I tried to follow your tutorial: https://blog.arcee.ai/tutorial-tutorial-how-to-get-started-with-evolutionary-model-merging/ The installation example causes an error. I almost gave up but then I found...

Hi @cg123, Great library, thanks a lot, super useful! I've finetuned GPT2 on 2 tasks (model1 and model2) and am trying to merge using your repo. It turns out, using...