zhangruoceng
Results: 2 issues of zhangruoceng
In stage1 and stage2, these two weights are trainable. The layer names are "llama_model.model.embed_tokens.weight" and "llama_model.lm_head.weight". But it seems that stage3 does not load these two weights correctly, as the...
Thank you for releasing the code! I'm training next-gpt stage1 on the cc3m dataset for the text-image modality only. How low should the loss be to be considered converged? And...