mergekit
Tools for merging pretrained large language models.
Is it okay to just merge these? - base model rope_theta: 500,000 - model1 rope_theta: 500,000 - model2 rope_theta: 20,000,000. How should I handle this?
When merging models whose linear layers have different structures, the following error occurred. I understand that errors can happen in this case, but is there a way to skip the specific layers where the error...
Hi: Tried a frankenmerge (passthrough) of these models and got an error on a 3B model: File "F:\mergekit2\mergekit\mergekit\io\tasks.py", line 86, in execute raise RuntimeError( RuntimeError: Tensor lm_head.weight required but not present...
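For context on errors like the one above: many small models (including some 3B checkpoints) tie the output head to the embedding matrix, so no separate `lm_head.weight` tensor exists in the checkpoint for a passthrough merge to copy. A minimal passthrough (frankenmerge) config sketch, with hypothetical model names and layer ranges:

```yaml
# Sketch only: model names and layer_range values are placeholders,
# not taken from the issue above.
slices:
  - sources:
      - model: some-org/model-a    # hypothetical donor for the lower layers
        layer_range: [0, 24]
  - sources:
      - model: some-org/model-b    # hypothetical donor for the upper layers
        layer_range: [8, 32]
merge_method: passthrough
dtype: bfloat16
```

If a source model uses tied embeddings, its checkpoint may simply lack `lm_head.weight`, which would produce the RuntimeError quoted in the issue.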
vllm worker error
I would like to merge the deepseekForCausalLM model. Are there any related examples available? model.index.json:

```json
{
  "metadata": { "total_size": 494385573888 },
  "weight_map": {
    "lm_head.weight": "pytorch_model-00100-of-00100.bin",
    "model.embed_tokens.weight": "pytorch_model-00001-of-00100.bin",
    "model.layers.0.input_layernorm.weight": "pytorch_model-00003-of-00100.bin",
    "model.layers.0.mlp.experts.0.down_proj.weight":...
```
Hello everyone, I have an issue: is there a way to use LoRA adapters (safetensors) from models fine-tuned on my own data for merging? Has anyone managed to...
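One possible approach, assuming mergekit's `base_model+adapter_path` syntax for applying a LoRA adapter to a model entry (check the current README, as this may vary by version; all names and paths below are placeholders):

```yaml
# Sketch only: model names, the adapter path, and weights are hypothetical.
models:
  - model: some-org/base-model+./my-lora-adapter   # base model with LoRA applied
    parameters:
      weight: 0.5
  - model: some-org/other-model
    parameters:
      weight: 0.5
merge_method: linear
dtype: bfloat16
```

If that syntax is unavailable in your version, an alternative is to merge the adapter into the base model first (e.g. with PEFT's `merge_and_unload()` and `save_pretrained()`) and point the config at the resulting full checkpoint.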
Actual example of a merge that produced this issue:

```yaml
models:
  - model: Qwen/Qwen2.5-14B-Instruct
    parameters:
      weight: 0.3
      density: 0.4
merge_method: della
base_model:
parameters:
  epsilon: 0.05
  lambda: 1
dtype: bfloat16
tokenizer_source:...
```
Hi, thanks for this wonderful model-merging codebase. I'd like to use it to merge vision models, specifically models that share nearly the same architecture but were trained with different...
Is it possible to add support for xlm-roberta? It's the same architecture as roberta, except for a larger vocabulary since it is multi-lingual.
I followed the installation instructions. Unfortunately, the process failed to create the mergekit-yaml script. As a result, any commands requiring it do not work.