zhangruoceng issues


Results: 2 issues of zhangruoceng

Should embed_tokens.weight and lm_head.weight be frozen in stage1 and stage2?

In stage1 and stage2 these two weights are trainable, and the layer names are "llama_model.model.embed_tokens.weight" and "llama_model.lm_head.weight". But it seems that stage3 does not load these two weights correctly, as the...
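
One way to narrow this kind of problem down is to check whether the two tensors named above are actually present in the saved checkpoint before looking at the loading code. The snippet below is only a minimal sketch in plain Python: `report_target_keys` is a hypothetical helper, and `example_keys` is a made-up stand-in for the key list of a real checkpoint (for a PyTorch checkpoint this would typically be the keys of the loaded state dict). Only the two layer names come from the issue.

```python
# Hypothetical helper for illustration; not part of any released code.
# The two names below are the layer names quoted in the issue.
TARGET_KEYS = (
    "llama_model.model.embed_tokens.weight",
    "llama_model.lm_head.weight",
)


def report_target_keys(checkpoint_keys, target_keys=TARGET_KEYS):
    """Split the target names into those present in and missing from a checkpoint."""
    available = set(checkpoint_keys)
    present = [k for k in target_keys if k in available]
    missing = [k for k in target_keys if k not in available]
    return present, missing


# Made-up key list standing in for a real checkpoint (assumption, for illustration only).
example_keys = [
    "some_other_layer.weight",
    "some_other_layer.bias",
    "llama_model.lm_head.weight",
]

present, missing = report_target_keys(example_keys)
print("present:", present)
print("missing:", missing)
```

If either name shows up under `missing`, the checkpoint was saved without that tensor, so the later stage has nothing to load for it; if both are present, the mismatch is more likely on the loading side.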

Question about loss and token_acc when training stage1 on the cc3m dataset

Thank you for releasing the code! I'm training next-gpt stage1 on the cc3m dataset for the text-image modality only. How low should the loss be to be considered converged? And...