Hongqiang Guo

Results 2 issues of Hongqiang Guo

@karpathy First of all, thank you so much for sharing your knowledge. I updated the initialization of self.vocab because I don't feel we need to call self._build_vocab(). I also cleaned...

The `crop_block_size` method in the GPT class only crops `wpe.weight` and `attn.bias`. Don't we also need to crop `attn.weight`, whose shape is (B*T*C), where T is `block_size`? ![image](https://github.com/karpathy/nanoGPT/assets/161912561/c0955dd8-a5ee-44a3-97e3-6cc02961b635)
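
One way to check this is to list which tensor shapes actually change when `block_size` changes. The sketch below is not code from the repo. It assumes a GPT-2-style layout in which the attention projections are plain linear layers over the embedding dimension, the position embedding has one row per position, and the causal mask is a `(1, 1, block_size, block_size)` buffer. The helper names `gpt_shapes` and `depends_on_block_size` are hypothetical, and the default sizes are just GPT-2 small values used for illustration.

```python
def gpt_shapes(block_size, n_embd, vocab_size):
    """Shapes of the tensors in a GPT-2-style model (hypothetical helper,
    assumed layout; check against the actual model definition)."""
    return {
        "wte.weight": (vocab_size, n_embd),           # token embedding
        "wpe.weight": (block_size, n_embd),           # position embedding
        "attn.c_attn.weight": (3 * n_embd, n_embd),   # fused q, k, v projection
        "attn.c_proj.weight": (n_embd, n_embd),       # attention output projection
        "attn.bias": (1, 1, block_size, block_size),  # causal mask buffer
    }


def depends_on_block_size(n_embd=768, vocab_size=50257):
    """Return the names whose shape differs between two block sizes."""
    large = gpt_shapes(1024, n_embd, vocab_size)
    small = gpt_shapes(256, n_embd, vocab_size)
    return sorted(name for name in large if large[name] != small[name])


print(depends_on_block_size())  # ['attn.bias', 'wpe.weight']
```

Under these assumptions, only `wpe.weight` and `attn.bias` carry a `block_size` dimension. A (B, T, C) shape describes the activations flowing through attention rather than a stored weight, so it would not need cropping.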