Zhang Peiyuan
Hi @Rayhane-mamah, thanks for your legendary answer! While I have more or less grasped your ideas, I have another question that has bothered me for days: why use an...
@peiji1981 Hi, sorry for missing this PR. I will find time to look at it no later than this week!
@Saltychtao I also encountered a similar issue. Does `vq_in` refer to `VectorQuantize.project_in`?
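For context, a minimal sketch of where `project_in` sits, assuming the library under discussion is lucidrains' vector-quantize-pytorch (an assumption; the thread does not name it). There, `project_in` is the layer that maps inputs from `dim` down to `codebook_dim` before the codebook lookup:

```python
# Sketch assuming lucidrains' vector-quantize-pytorch; `project_in` is
# applied inside forward() to project inputs down to the codebook dim.
import torch
from vector_quantize_pytorch import VectorQuantize

vq = VectorQuantize(dim=256, codebook_dim=32, codebook_size=512)
print(vq.project_in)  # a down-projection here; an identity when dims match

x = torch.randn(1, 1024, 256)
quantized, indices, commit_loss = vq(x)  # projection runs before quantization
```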
3. In the paper, you mentioned "during training, the models in both stages are trained independently". Do you mean a single model trained on the two stages sequentially, or two...
@RonanKMcGovern I just tested all of TinyLlama's chat models (V0.1 to V0.6) and none of them generates repetition. I am not sure why that is the case for you. Below is...
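For anyone who wants to reproduce the check, here is a minimal sketch (not the original snippet, which is truncated above), assuming the Hugging Face `transformers` pipeline and the `TinyLlama/TinyLlama-1.1B-Chat-v0.6` checkpoint name:

```python
# Minimal repetition check, assuming the Hugging Face `transformers`
# text-generation pipeline and the public checkpoint name below.
from transformers import pipeline

pipe = pipeline("text-generation", model="TinyLlama/TinyLlama-1.1B-Chat-v0.6")

# The later chat checkpoints ship a chat template with the tokenizer;
# apply it instead of hand-formatting the prompt.
messages = [{"role": "user", "content": "Explain what a language model is."}]
prompt = pipe.tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)

out = pipe(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_p=0.9)
print(out[0]["generated_text"])  # inspect the output for repeated phrases
```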
Updated: https://github.com/jzhang38/TinyLlama/blob/main/requirements.txt
Exploring retrieval-augmented generation is on our TODO list!
Yes, we are currently reading papers about retrieval-augmented LMs to find out which training/adaptation setup for RAG is best suited for TinyLlama. It would be great if you could...
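As an illustration of what such a setup could look like, here is a minimal retrieve-then-prompt sketch (one possible in-context RAG arrangement, not the setup the team settled on), assuming `sentence-transformers` for retrieval and the chat checkpoint above for generation:

```python
# Minimal in-context RAG sketch: retrieve the top-k passages by embedding
# similarity, then prepend them to the prompt. Illustrative only.
from sentence_transformers import SentenceTransformer, util
from transformers import pipeline

docs = [
    "TinyLlama is a 1.1B-parameter model with the Llama 2 architecture.",
    "TinyLlama was pretrained on roughly 3 trillion tokens.",
]

encoder = SentenceTransformer("all-MiniLM-L6-v2")
doc_emb = encoder.encode(docs, convert_to_tensor=True)
gen = pipeline("text-generation", model="TinyLlama/TinyLlama-1.1B-Chat-v0.6")

def rag_answer(question: str, k: int = 1) -> str:
    # Embed the question and pick the k most similar passages.
    q_emb = encoder.encode(question, convert_to_tensor=True)
    hits = util.semantic_search(q_emb, doc_emb, top_k=k)[0]
    context = "\n".join(docs[h["corpus_id"]] for h in hits)
    prompt = f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    return gen(prompt, max_new_tokens=64)[0]["generated_text"]

print(rag_answer("How many tokens was TinyLlama trained on?"))
```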
Hi, Artnoage. If you check the [training log](https://wandb.ai/lance777/lightning_logs/reports/metric-train_loss-23-09-04-23-38-15---Vmlldzo1MzA4MzIw?accessToken=5eu2sndit2mo6eqls8h38sklcgfwt660ek1f2czlgtqjv2c6tida47qm1oty8ik9), you can see that I actually resumed the process twice and did not notice any memory error. I am not sure why that is the case...
> When you did the first run, did you check the memory usage?

The memory usage is always 39G on my end.
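For anyone comparing memory numbers, a small helper along these lines (an illustrative sketch, not part of the TinyLlama training code) can log GPU memory around checkpoint saves and resume points:

```python
# Illustrative helper for logging GPU memory during training;
# not from the TinyLlama codebase.
import torch

def log_gpu_memory(tag: str = "") -> None:
    # memory_allocated: bytes held by live tensors;
    # memory_reserved: bytes the caching allocator has claimed from the driver.
    alloc = torch.cuda.memory_allocated() / 2**30
    reserved = torch.cuda.memory_reserved() / 2**30
    print(f"[{tag}] allocated={alloc:.2f} GiB, reserved={reserved:.2f} GiB")

# e.g. call log_gpu_memory("after-resume") right after loading a checkpoint
```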