Zhang Peiyuan
Hi @Rayhane-mamah, thanks for your legendary answer! While I have more or less grasped your ideas, I have another question that has bothered me for days: why use an...
@peiji1981 Hi, sorry for missing this PR. I will find time to look at it no later than this week!
@Saltychtao I also encountered a similar issue. Does `vq_in` refer to `VectorQuantize.project_in`?
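For context, a minimal sketch of where `project_in` sits, assuming the library under discussion is lucidrains' vector-quantize-pytorch (an assumption; the thread does not name it). There, `project_in` is the layer that maps inputs from `dim` down to `codebook_dim` before the codebook lookup:

```python
# Sketch assuming lucidrains' vector-quantize-pytorch; `project_in` is
# applied inside forward() to project inputs down to the codebook dim.
import torch
from vector_quantize_pytorch import VectorQuantize

vq = VectorQuantize(dim=256, codebook_dim=32, codebook_size=512)
print(vq.project_in)  # a down-projection here; an identity when dims match

x = torch.randn(1, 1024, 256)
quantized, indices, commit_loss = vq(x)  # projection runs before quantization
```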
3. In the paper, you mentioned "during training, the models in both stages are trained independently". Do you mean a single model trained on the two stages sequentially, or two...
@RonanKMcGovern I just tested all of TinyLlama's chat models (V0.1 to V0.6) and none of them generates repetition. I am not sure why that is the case for you. Below is...
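For anyone who wants to reproduce the check, here is a minimal sketch (not the original snippet, which is truncated above), assuming the Hugging Face `transformers` pipeline and the `TinyLlama/TinyLlama-1.1B-Chat-v0.6` checkpoint name:

```python
# Minimal repetition check, assuming the Hugging Face `transformers`
# text-generation pipeline and the public checkpoint name below.
from transformers import pipeline

pipe = pipeline("text-generation", model="TinyLlama/TinyLlama-1.1B-Chat-v0.6")

# The later chat checkpoints ship a chat template with the tokenizer;
# apply it instead of hand-formatting the prompt.
messages = [{"role": "user", "content": "Explain what a language model is."}]
prompt = pipe.tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)

out = pipe(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_p=0.9)
print(out[0]["generated_text"])  # inspect the output for repeated phrases
```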
Updated: https://github.com/jzhang38/TinyLlama/blob/main/requirements.txt
Exploring retrieval-augmented generation is on our TODO list!
Yes, we are currently reading papers about retrieval-augmented LMs to find out which training/adaptation setup for RAG is best suited for TinyLlama. It would be great if you could...
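As an illustration of what such a setup could look like, here is a minimal retrieve-then-prompt sketch (one possible in-context RAG arrangement, not the setup the team settled on), assuming `sentence-transformers` for retrieval and the chat checkpoint above for generation:

```python
# Minimal in-context RAG sketch: retrieve the top-k passages by embedding
# similarity, then prepend them to the prompt. Illustrative only.
from sentence_transformers import SentenceTransformer, util
from transformers import pipeline

docs = [
    "TinyLlama is a 1.1B-parameter model with the Llama 2 architecture.",
    "TinyLlama was pretrained on roughly 3 trillion tokens.",
]

encoder = SentenceTransformer("all-MiniLM-L6-v2")
doc_emb = encoder.encode(docs, convert_to_tensor=True)
gen = pipeline("text-generation", model="TinyLlama/TinyLlama-1.1B-Chat-v0.6")

def rag_answer(question: str, k: int = 1) -> str:
    # Embed the question and pick the k most similar passages.
    q_emb = encoder.encode(question, convert_to_tensor=True)
    hits = util.semantic_search(q_emb, doc_emb, top_k=k)[0]
    context = "\n".join(docs[h["corpus_id"]] for h in hits)
    prompt = f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    return gen(prompt, max_new_tokens=64)[0]["generated_text"]

print(rag_answer("How many tokens was TinyLlama trained on?"))
```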
Hi, Artnoage. If you check the [training log](https://wandb.ai/lance777/lightning_logs/reports/metric-train_loss-23-09-04-23-38-15---Vmlldzo1MzA4MzIw?accessToken=5eu2sndit2mo6eqls8h38sklcgfwt660ek1f2czlgtqjv2c6tida47qm1oty8ik9), you can see that I actually resumed the process twice and did not notice any memory error. I am not sure why that is the case...
> When you did the first run, did you check the memory usage?

The memory usage is always 39G on my end.
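For anyone comparing memory numbers, a small helper along these lines (an illustrative sketch, not part of the TinyLlama training code) can log GPU memory around checkpoint saves and resume points:

```python
# Illustrative helper for logging GPU memory during training;
# not from the TinyLlama codebase.
import torch

def log_gpu_memory(tag: str = "") -> None:
    # memory_allocated: bytes held by live tensors;
    # memory_reserved: bytes the caching allocator has claimed from the driver.
    alloc = torch.cuda.memory_allocated() / 2**30
    reserved = torch.cuda.memory_reserved() / 2**30
    print(f"[{tag}] allocated={alloc:.2f} GiB, reserved={reserved:.2f} GiB")

# e.g. call log_gpu_memory("after-resume") right after loading a checkpoint
```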