ng-video-lecture
Hey @karpathy, I created a high-level UML diagram showing what's going on in [gpt.py](https://github.com/karpathy/ng-video-lecture/blob/master/gpt.py). This will make it easier for folks to _hack_ the rest of...
Fixed "Shadows name 'var' from outer scope", "Variable in function should be lowercase", and "Shadows built-in name 'iter'" in iter variables; fixed other edit styles like: PEP...
pin
how do I crack a pin showing this (****)?
We are scaling it by number of channels in input at present...
In https://github.com/karpathy/ng-video-lecture/blob/master/gpt.py on lines 136 / 137:
```
# super simple bigram model
class BigramLanguageModel(nn.Module):
```
Just to clarify, is this now a GPT model and not a bigram model?
Instead of indexing positional embeddings we can slice them. It has a couple of benefits:
1. Looks cleaner
2. When indexing - returns a new tensor (plus a new tensor each...
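
To make the comparison concrete, here is a minimal sketch of the two approaches. The sizes are illustrative rather than the lecture's hyperparameters, and slicing `.weight[:T]` is an assumption about what the proposal means, since the excerpt above is cut off.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
block_size, n_embd, T = 8, 4, 5  # illustrative sizes only (assumption)
position_embedding_table = nn.Embedding(block_size, n_embd)

# Indexing: build an index tensor, then do an embedding lookup.
pos_emb_indexed = position_embedding_table(torch.arange(T))

# Slicing: take the first T rows of the weight matrix directly.
pos_emb_sliced = position_embedding_table.weight[:T]

# Both give the same (T, n_embd) values.
same_values = torch.equal(pos_emb_indexed, pos_emb_sliced)

# The slice is a view that starts at the same memory as the weight matrix.
shares_storage = (
    pos_emb_sliced.data_ptr() == position_embedding_table.weight.data_ptr()
)
```

Both forms stay connected to the embedding weights for backpropagation, so the choice is mostly about readability and avoiding the extra `torch.arange` tensor.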
I want to talk about loss calculation in the forward method:
```python
else:
    B, T, C = logits.shape
    logits = logits.view(B*T, C)
    targets = targets.view(B*T)
    loss = F.cross_entropy(logits, targets)
return...
```
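
For context on why the reshape is there, here is a small sketch with made-up shapes and random data. `F.cross_entropy` expects the class dimension at index 1, so a `(B, T, C)` tensor has to be either flattened or transposed first. The transposed variant is only shown for comparison and is not necessarily what this issue goes on to propose.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
B, T, C = 2, 3, 5  # batch, time, classes; illustrative sizes (assumption)
logits = torch.randn(B, T, C)
targets = torch.randint(0, C, (B, T))

# Flattened form, as in the snippet above: (B*T, C) logits, (B*T,) targets.
loss_flat = F.cross_entropy(logits.view(B * T, C), targets.view(B * T))

# Alternative: move the class dimension to index 1, giving (B, C, T) logits
# with (B, T) targets. The default mean reduction averages over all B*T positions.
loss_transposed = F.cross_entropy(logits.transpose(1, 2), targets)

losses_match = torch.allclose(loss_flat, loss_transposed)
```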
The Keras counterpart resembles Andrej Karpathy's original PyTorch code as closely as possible. Not only was this good practice for me, I hope this will help the Keras practitioners as...
It seems `m = model.to(device)` creates a duplicate reference to the model that is not needed. This is just to make it simpler and clearer. :)
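
A quick way to check this: `nn.Module.to` moves parameters in place and returns the module itself, so both names point at the same object. The sketch below uses a stand-in `nn.Linear` instead of the lecture's model.

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 2)  # stand-in for the real model (assumption)
device = "cuda" if torch.cuda.is_available() else "cpu"

m = model.to(device)

# Module.to returns self, so `m` is just another name for `model`.
same_object = m is model
```

This holds for modules only. `Tensor.to` can return a new tensor, so for tensors the reassignment is still needed.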