Rahul C comments

Results: 16 comments of Rahul C

Could you provide the training hyper paras

Hi @juyiming the hyper-parameters for all GLUE Tasks (which includes MNLI/MRPC) & SQUAD_V2.0 are available in the Appendix A4 section. ![image](https://user-images.githubusercontent.com/16897807/131207616-09c358a6-b408-473a-ad6e-1daea8811e71.png)

some bugs here , I tried your code ,and .....

Change the loss function in train.py to the one below.
```
def loss(self, x, target, m): #x:b,10 target:b
    target = target*x
    zero_tensor = torch.tensor(0.0).cuda()
    loss = torch.max(zero_tensor, m-(target-x))**2
    loss = torch.mean(loss)
    return loss
```
...
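For anyone without a GPU, the same expression can be written without the hard-coded `.cuda()` call. This is only a sketch: the name `margin_loss` is hypothetical, and it assumes `target` is a one-hot mask with the same shape as `x`, which the snippet above does not confirm. `torch.clamp(..., min=0.0)` is used as the elementwise equivalent of `torch.max(zero_tensor, ...)`.

```python
import torch

def margin_loss(x, target, m):
    # Assumption: `target` is a one-hot mask with the same shape as `x`.
    target = target * x
    # clamp(min=0) is an elementwise max(0, .) and stays on the device of `x`,
    # so no explicit .cuda() call is needed.
    hinge = torch.clamp(m - (target - x), min=0.0)
    return torch.mean(hinge ** 2)

# Tiny CPU example with one sample and two classes.
x = torch.tensor([[0.9, 0.1]])
target = torch.tensor([[1.0, 0.0]])
loss_value = margin_loss(x, target, 0.2).item()
print(loss_value)
```

Because nothing is pinned to a device, the function runs on whichever device `x` lives on.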

some bugs here , I tried your code ,and .....

Also, since you hit an issue at line 43, you may additionally get an error at line _102_, `acc = pred.eq(labels).cpu().sum().data[0]`. In that case, change it to `acc = pred.eq(labels).cpu().sum().item()`.
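A minimal sketch of the `.item()` form, using made-up `pred` and `labels` tensors in place of the ones the training script produces: summing the comparison gives a 0-dimensional tensor, and `.item()` converts it to a plain Python number.

```python
import torch

# Hypothetical predictions and labels, just for illustration.
pred = torch.tensor([1, 0, 2, 2])
labels = torch.tensor([1, 1, 2, 0])

# .sum() yields a 0-dim tensor; .item() extracts the Python scalar.
acc = pred.eq(labels).cpu().sum().item()
print(acc)
```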

Stuck on training: Created a PretokDataset with rng seed 42

@CatTimson @madroidmaq can you try setting `pin_memory=False` in this [line](https://github.com/karpathy/llama2.c/blob/bd182289c596fa6059eb7b3b7c8ccd04b5c90fc3/tinystories.py#L238C40-L238C50) ?
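As a rough illustration of the change, assuming the linked line constructs a `torch.utils.data.DataLoader`: the `TensorDataset` below is a hypothetical stand-in for the real dataset, and the batch size and worker count are arbitrary.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical stand-in dataset; the real script builds its own dataset.
ds = TensorDataset(torch.arange(12).reshape(6, 2))

# pin_memory=False skips copying batches into page-locked host memory.
dl = DataLoader(ds, batch_size=3, pin_memory=False, num_workers=0)

first_batch = next(iter(dl))[0]
print(first_batch.shape)
```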

Stuck on training: Created a PretokDataset with rng seed 42

> > @CatTimson @madroidmaq can you try setting `pin_memory=False` in this [line](https://github.com/karpathy/llama2.c/blob/bd182289c596fa6059eb7b3b7c8ccd04b5c90fc3/tinystories.py#L238C40-L238C50) ?
> >
> > @RahulSChand According to your method, my problem disappeared, I can train and see the detailed...

Stuck on training: Created a PretokDataset with rng seed 42

@CatTimson what is your PyTorch version? Check with `print(torch.__version__)`. PyTorch `2.0.1+cu117` works for me. You can get the same version in a new environment with `pip install torch==2.0.1+cu117 --index-url https://download.pytorch.org/whl/cu117`
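A slightly fuller check than printing the version alone might look like the sketch below. It also reports the CUDA toolkit the wheel was built against and whether a GPU is visible. The variable names are illustrative only.

```python
import torch

# Version string of the installed build; CUDA wheels carry a suffix such as "+cu117".
torch_version = str(torch.__version__)
# CUDA toolkit version the build targets; None for CPU-only builds.
cuda_build = torch.version.cuda
# Whether a usable GPU is visible to this process.
gpu_visible = torch.cuda.is_available()

print(torch_version, cuda_build, gpu_visible)
```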

Stuck on training: Created a PretokDataset with rng seed 42

@madroidmaq can you remove the `time.time()` calls from the following lines? https://github.com/karpathy/llama2.c/blob/bd182289c596fa6059eb7b3b7c8ccd04b5c90fc3/train.py#L249 https://github.com/karpathy/llama2.c/blob/bd182289c596fa6059eb7b3b7c8ccd04b5c90fc3/train.py#L322 Just set `t0 = 0` and `t1 = 0` and check. Also, you can change the line below to `print(...., flush=True)` to see if the issue...
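The two suggestions can be sketched together as follows. The helper names `format_log` and `log_step` and the log format are hypothetical, not taken from train.py. The timers are fixed at zero, and `flush=True` forces each line out of the stdout buffer immediately, so a run that looks stuck can be told apart from one whose output is merely buffered.

```python
def format_log(iter_num, loss, dt_ms=0.0):
    # Hypothetical log format, for illustration only.
    return f"{iter_num} | loss {loss:.4f} | {dt_ms:.2f}ms"

def log_step(iter_num, loss, dt_ms=0.0):
    # flush=True writes the line right away instead of waiting for the buffer to fill.
    print(format_log(iter_num, loss, dt_ms), flush=True)

# Timers replaced by constants, as suggested, to rule out the time.time() calls.
t0 = 0
t1 = 0
for step, loss in enumerate([2.5, 2.25, 2.0]):
    log_step(step, loss, dt_ms=(t1 - t0) * 1000)
```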