Easy-Transformer
[Bug Report] Grokking demo currently broken in Colab
The Grokking demo is currently not executing properly in Colab.
Describe the bug
The Grokking demo currently fails to execute in Colab.
Code example
The block
print(loss_fn(all_logits, labels)) # This bugged on models not fully trained
generates the following error:
RuntimeError Traceback (most recent call last)
<ipython-input-88-6591fc338d6e> in <cell line: 1>()
----> 1 print(loss_fn(all_logits, labels)) # This bugged on models not fully trained
<ipython-input-18-45ad8e4e85fe> in loss_fn(logits, labels)
4 logits = logits.to(torch.float64)
5 log_probs = logits.log_softmax(dim=-1)
----> 6 correct_log_probs = log_probs.gather(dim=-1, index=labels[:, None])[:, 0]
7 return -correct_log_probs.mean()
8 train_logits = model(train_data)
RuntimeError: Size does not match at dimension 0 expected index [12769, 1] to be smaller than self [113, 113] apart from dimension 1
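The shapes in the message suggest that `log_probs` has only 113 rows while `labels` has 12769 (113 * 113) entries, and `gather` requires the index to be no larger than the input in every dimension other than `dim`. Below is a minimal sketch that reproduces the mismatch outside the notebook. It assumes random placeholder tensors rather than the demo's real `all_logits` and `labels`, and the `loss_fn` body contains only the four lines visible in the traceback; any earlier lines of the original function are not shown there and are left out.

```python
import torch


def loss_fn(logits, labels):
    # Only the lines visible in the traceback (4-7) are reproduced here;
    # earlier lines of the original function are not shown and are omitted.
    logits = logits.to(torch.float64)
    log_probs = logits.log_softmax(dim=-1)
    correct_log_probs = log_probs.gather(dim=-1, index=labels[:, None])[:, 0]
    return -correct_log_probs.mean()


p = 113  # taken from the [113, 113] shape in the error; 113 * 113 = 12769
labels = torch.randint(0, p, (p * p,))  # placeholder labels, not the demo's data

# One row of logits per label: gather succeeds and returns a scalar loss.
good_logits = torch.randn(p * p, p)
good_loss = loss_fn(good_logits, labels)

# Only 113 rows of logits for 12769 labels: gather raises a RuntimeError,
# as in the report.
bad_logits = torch.randn(p, p)
try:
    loss_fn(bad_logits, labels)
    error_message = None
except RuntimeError as err:
    error_message = str(err)

print(good_loss.shape, error_message)
```

If this matches what happens in the notebook, printing `all_logits.shape` and `labels.shape` just before the failing cell should show where the leading dimension stops being 12769.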
Skipping this block lets the demo run for one more block, after which it fails with a memory error.
System Info
Describe the characteristics of your environment:
- Colab T4 GPU
Additional context
Checklist
- [x] I have checked that there is no similar issue in the repo (required)
I can work on this today.