mamba: How to merge weights?

Open junphine opened this issue 1 year ago • 5 comments


I trained 3 models, but after averaging the weights, the model output is garbled!
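For reference, uniform weight averaging as described above can be sketched as follows. This is a minimal sketch, assuming the three models share the same architecture and parameter names; `average_state_dicts` is a hypothetical helper, not part of the mamba repository. It works on any values that support `+` and `/`, such as PyTorch tensors or plain floats.

```python
def average_state_dicts(state_dicts):
    """Uniformly average a list of parameter dictionaries.

    Assumption: every dictionary maps the same parameter names to values of
    the same shape that support `+` and `/` (e.g. torch tensors or floats).
    """
    if not state_dicts:
        raise ValueError("need at least one state dict")

    # All checkpoints must expose exactly the same parameter names.
    names = set(state_dicts[0])
    for sd in state_dicts[1:]:
        if set(sd) != names:
            raise ValueError("parameter names differ between checkpoints")

    n = len(state_dicts)
    averaged = {}
    for name in state_dicts[0]:
        # Out-of-place addition, so the input dictionaries are not modified.
        total = state_dicts[0][name]
        for sd in state_dicts[1:]:
            total = total + sd[name]
        averaged[name] = total / n
    return averaged


# Hypothetical usage with PyTorch modules `m1`, `m2`, `m3` and a fresh `merged`:
# merged.load_state_dict(average_state_dicts([m.state_dict() for m in (m1, m2, m3)]))
```

Note that dividing integer-typed buffers produces floating-point values, so such entries may need separate handling.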

Jan 29 '24 11:01 junphine

This is an unexplored research direction, I'm not sure what the best practices are here.

Jan 29 '24 22:01 albertfgu

[Image: 1706603511810]

In my experiments, averaging the weights seems to speed up training.

Jan 30 '24 08:01 junphine

Red is the merged model; cyan is one of the three models.
The learning rate is the same for both, but the training data is new to the cyan model and was previously seen by the red one.

Jan 30 '24 08:01 junphine

Sorry, I'm lacking a lot of context here and am not sure how to help!

Jan 30 '24 16:01 albertfgu

I don't have experience with model merging, keeping this issue open in case there are others who can help.

Jan 30 '24 18:01 tridao