mergekit
Will the resulting model size increase much after merging
E.g.: the first model to merge has size M1, the second has size M2, and the third has size M3. Will the final merged model have size (M1 + M2 + M3)?
In general, all of the models you merge together need to be the same size, and the output will be that same size as well. For example, if you're merging Mistral models, you can combine however many 7B models you like and the output will still be 7B.
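As an illustration, a minimal config along those lines might look like the sketch below (modeled on the linear-merge examples in the mergekit README; the model names and weights are placeholders, not from this thread):

```yaml
# Merge two same-size, same-architecture 7B models; the output is still 7B.
models:
  - model: org-a/finetuned-mistral-7b      # placeholder model name
    parameters:
      weight: 0.5
  - model: org-b/another-mistral-7b        # placeholder model name
    parameters:
      weight: 0.5
merge_method: linear
dtype: float16
```

You would then run it with something like `mergekit-yaml config.yml ./merged-model`.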
There are two exceptions to this. One is if you use the `slices:` configuration syntax to make a model that has more layers than your input models (commonly called "frankenmerging"; this is where models like Goliath or MegaDolphin-120b come from). The other is the `mergekit-moe` script, which produces a pseudo-mixture-of-experts model whose size will be approximately the sum of the input sizes.
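For contrast with the standard merge above, here is a hedged sketch of the layer-stacking (`slices:` + passthrough) style that produces a larger output; the model names and layer ranges are illustrative only:

```yaml
# Frankenmerge sketch: layers from two 7B models are stacked, so the
# output has more layers than either input and is larger than 7B.
slices:
  - sources:
      - model: model-a-7b        # placeholder
        layer_range: [0, 24]
  - sources:
      - model: model-b-7b        # placeholder
        layer_range: [8, 32]
merge_method: passthrough
dtype: float16
```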
So can we merge different models that have the same number of parameters, like Mistral 7B and OpenChat 7B?
You can. Mistral and OpenChat both use the Mistral architecture and have the same number of parameters, so you can merge them. The result will also be a 7B-parameter Mistral-architecture model.
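One way such a merge might look is the SLERP sketch below, following the pattern of the slerp examples in the mergekit README; the repo IDs are placeholders for your own fine-tuned checkpoints and the interpolation factor is just an assumed starting point:

```yaml
# SLERP between two fine-tuned Mistral-architecture 7B models.
slices:
  - sources:
      - model: your-name/finetuned-mistral-7b-instruct   # placeholder
        layer_range: [0, 32]
      - model: your-name/finetuned-openchat-7b           # placeholder
        layer_range: [0, 32]
merge_method: slerp
base_model: your-name/finetuned-mistral-7b-instruct
parameters:
  t:
    - value: 0.5   # uniform interpolation, halfway between the two models
dtype: bfloat16
```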
Great, thanks. So I have two fine-tuned models: 1) a model fine-tuned from Mistral 7B Instruct, and 2) a model fine-tuned from OpenChat 7B. I hope that using mergekit I can merge both fine-tuned models and have them work as a single model.
Both fine-tuned models expect different prompt formats, so how can I handle the prompt formats after merging them into a single model?