pytorch-openai-transformer-lm
loading pretrained open ai model
Can somebody please explain what these parameters are here: https://github.com/huggingface/pytorch-openai-transformer-lm/blob/561d4096be7f66a49b7b989eff09e2ab6ba54bb7/model_pytorch.py#L303 (e.g., offsets, init parameters)? Can you add some comments to this function? Thanks!
There you go (@thomwolf correct me if I'm wrong on any of these):
- `n_ctx` is the maximum number of tokens in an input sequence.
- `n_special` is the number of special tokens used to format the input properly. For example, in the ROCStories problem we use 3 additional tokens: `_start_`, `_delimiter_`, and `_classify_`.
- `n_transfer` is the number of pre-trained layers that will be loaded; the following layers will be initialized randomly.
- `n_embd` is the dimension of the embeddings and of the vector associated with each position in the network. It has the value 768 because the network uses multi-head attention with 12 heads and 768 = 12 * 64.
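To make the relationship between these parameters concrete, here is a minimal sketch of how the special tokens might extend the pre-trained vocabulary and how a ROCStories-style input could be assembled and padded to `n_ctx`. The function name, token ids, and the value of `n_ctx` are illustrative assumptions, not the repository's actual API:

```python
# Hypothetical sketch (not the repository's actual code): special-token ids
# are appended after the pre-trained BPE vocabulary, and every formatted
# sequence is padded to a fixed length n_ctx.

n_vocab = 40478        # size of the pre-trained BPE vocabulary (assumed)
n_special = 3          # _start_, _delimiter_, _classify_
n_ctx = 77             # maximum number of tokens in a formatted sequence (assumed)

start = n_vocab        # special-token ids follow the pre-trained vocabulary
delimiter = n_vocab + 1
classify = n_vocab + 2

def format_rocstories(story_ids, ending_ids):
    """Build [_start_] story [_delimiter_] ending [_classify_], padded to n_ctx."""
    seq = [start] + story_ids + [delimiter] + ending_ids + [classify]
    assert len(seq) <= n_ctx, "formatted sequence exceeds n_ctx"
    return seq + [0] * (n_ctx - len(seq))

x = format_rocstories([10, 11, 12], [20, 21])
```

The resulting sequence always has length `n_ctx`, which is why the embedding matrix in the model is sized `n_vocab + n_special + n_ctx`: one row per vocabulary token, per special token, and per position.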
If these comments are helpful to you, I will add them to the code.
Exactly!
@rodgzilla thanks!