Chen Wu

5 comments by Chen Wu

Hi, could you share the command you ran for this experiment?

Is the highest eval score the same as the test score?

Can you run the following command on the same machine (so that the previous checkpoints are still there) and see whether the results differ?

```
python -m torch.distributed.launch...
```

I just realized that the command you provided is for T5-3b without using deepspeed. I remember that we didn't manage to run without deepspeed even on an A100. What kind...

> love this idea - but probably want to also save it on the class and use it in `select_examples`?

Great idea! I implemented this in the latest commit.