minGPT
The definitions of `B, T, C`
https://github.com/karpathy/minGPT/blob/4050db60409b5bbaaa3302cee1e49847fc145c65/mingpt/model.py#L62
and referred to http://jalammar.github.io/illustrated-gpt2/.

I remain confused about the definitions of `B, T, C = x.size()`. Are they the vocabulary length, batch size, tokenizer size, etc.?

Thanks.
`B` is the batch size, `T` is the sequence length, and `C` is the dimensionality of the embedding (`n_embd`).

At the first layer, if your batch size were 16, `n_embd=768`, and `block_size=128`, then the input to the layer would be a `(16, 128, 768)` tensor, giving you `B=16, T=128, C=768`.
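
As a minimal sketch of that unpacking (the sizes here are just the example values above, not minGPT's actual defaults, which come from its config):

```python
import torch

# Hypothetical sizes matching the example above; minGPT reads these from its config.
batch_size, block_size, n_embd = 16, 128, 768

# Dummy activations shaped like the input to a transformer block,
# i.e. after the token and position embeddings have been applied.
x = torch.randn(batch_size, block_size, n_embd)

B, T, C = x.size()  # the same unpacking pattern used in mingpt/model.py
print(B, T, C)      # 16 128 768
```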
karpathy could likely shed light on what specifically the letters are short for. `B` is obviously just short for Batch. I would guess that `T` is short for Tokens. `C` is likely short for Channels, probably for historical reasons: in CNNs the dimensionality of each "pixel" is called the number of channels, simply because images used that terminology to specify how many color channels they had. I think PyTorch ends up using `C` as shorthand for that axis in its API a lot.
Really clear, thanks a lot.
Hello! Why does this issue still appear to be open?
Thanks