CogVideo

What init strategy was used when extending the 2B model to 5B?

Open • spacegoing opened this issue 1 year ago • 1 comment

Feature request

I would love to hear about the team's experience extending the 2B model to 5B, including the initialization methods, training stages, etc.

spacegoing, Aug 30 '24 09:08

These are two different models, and both were trained from scratch, so the 5B model was not initialized from the 2B one. The model structures are somewhat dissimilar, especially in the embedding part. The other differences are mainly the number of layers and some hyperparameters; everything else is exactly the same.

zRzRzRzRzRzRzR, Aug 30 '24 11:08
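
For readers who want to check the differences described above (embedding, layer count, hyperparameters), one rough approach is to diff the two models' configuration dictionaries. The sketch below is only an illustration: `diff_configs` is a hypothetical helper, and the keys and values in `small_cfg` and `large_cfg` are made-up placeholders, not the real settings of either model.

```python
from typing import Any


def diff_configs(a: dict[str, Any], b: dict[str, Any]) -> dict[str, tuple[Any, Any]]:
    """Return {key: (value_in_a, value_in_b)} for every key whose values differ.

    A key missing from one side is reported with None on that side.
    """
    diffs: dict[str, tuple[Any, Any]] = {}
    for key in sorted(a.keys() | b.keys()):
        left, right = a.get(key), b.get(key)
        if left != right:
            diffs[key] = (left, right)
    return diffs


# Placeholder configs (hypothetical values, NOT the real model settings).
small_cfg = {"num_layers": 4, "positional_embedding": "scheme_a", "patch_size": 2}
large_cfg = {"num_layers": 8, "positional_embedding": "scheme_b", "patch_size": 2}

for name, (small, large) in diff_configs(small_cfg, large_cfg).items():
    print(f"{name}: {small} -> {large}")
```

In practice the two dictionaries would be loaded from each released checkpoint's config file. Keys that do not show up in the diff correspond to the "everything else is exactly the same" part of the answer.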