pytorch-openai-transformer-lm
loading pretrained open ai model
Can somebody please explain what these parameters are here: https://github.com/huggingface/pytorch-openai-transformer-lm/blob/561d4096be7f66a49b7b989eff09e2ab6ba54bb7/model_pytorch.py#L303 (e.g., offsets, init parameters)? Can you add some comments to this function? Thanks!
There you go (@thomwolf correct me if I'm wrong on any of these):
- `n_ctx` is the maximum number of tokens in an input sequence.
- `n_special` is the number of special tokens used to format the input properly. For example, in the ROCStories problem we use 3 additional tokens: `_start_`, `_delimiter_`, and `_classify_`.
- `n_transfer` is the number of pre-trained layers that will be loaded; the following layers will be initialized randomly.
- `n_embd` is the dimension of the embeddings and of the vector associated with each position in the network. It has the value 768 because the network uses multi-head attention with 12 heads and 768 = 12 * 64.
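To make the relationship between these parameters concrete, here is a minimal sketch of how the special tokens might extend the pre-trained vocabulary and how a ROCStories-style input could be assembled and padded to `n_ctx`. The function name, token ids, and the value of `n_ctx` are illustrative assumptions, not the repository's actual API:

```python
# Hypothetical sketch (not the repository's actual code): special-token ids
# are appended after the pre-trained BPE vocabulary, and every formatted
# sequence is padded to a fixed length n_ctx.

n_vocab = 40478        # size of the pre-trained BPE vocabulary (assumed)
n_special = 3          # _start_, _delimiter_, _classify_
n_ctx = 77             # maximum number of tokens in a formatted sequence (assumed)

start = n_vocab        # special-token ids follow the pre-trained vocabulary
delimiter = n_vocab + 1
classify = n_vocab + 2

def format_rocstories(story_ids, ending_ids):
    """Build [_start_] story [_delimiter_] ending [_classify_], padded to n_ctx."""
    seq = [start] + story_ids + [delimiter] + ending_ids + [classify]
    assert len(seq) <= n_ctx, "formatted sequence exceeds n_ctx"
    return seq + [0] * (n_ctx - len(seq))

x = format_rocstories([10, 11, 12], [20, 21])
```

The resulting sequence always has length `n_ctx`, which is why the embedding matrix in the model is sized `n_vocab + n_special + n_ctx`: one row per vocabulary token, per special token, and per position.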
If these comments are helpful to you, I will add them to the code.
Exactly!
@rodgzilla thanks!