
loading pretrained open ai model

Open mehdimashayekhi opened this issue 6 years ago • 3 comments

Can somebody please explain the parameters here: https://github.com/huggingface/pytorch-openai-transformer-lm/blob/561d4096be7f66a49b7b989eff09e2ab6ba54bb7/model_pytorch.py#L303 — e.g., the offsets and init parameters? Could you add some comments to this function? Thanks.

mehdimashayekhi avatar Jul 20 '18 04:07 mehdimashayekhi

There you go (@thomwolf correct me if I'm wrong on any of these):

  • n_ctx is the maximum number of tokens in an input sequence.
  • n_special is the number of special tokens used to format the input properly. For example, in the ROCStories problem we use 3 additional tokens: _start_, _delimiter_ and _classify_.
  • n_transfer is the number of pre-trained layers that will be loaded; the remaining ones will be initialized randomly.
  • n_embd is the dimension of the embedding and of the vector associated with each position in the network. It is 768 because the network uses multi-head attention with 12 heads and 768 = 12 * 64.

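To illustrate how these sizes fit together (and the "offsets" the question asks about): in this model, token embeddings, special-token embeddings, and position embeddings all live in a single shared embedding matrix, so position ids are offset past the vocabulary and special tokens. The sketch below uses illustrative values — the exact vocabulary size and n_ctx depend on the encoder and task, so treat the numbers as assumptions, not the model's actual configuration:

```python
# Illustrative sketch of the index layout (values are assumptions,
# not the model's actual configuration).
n_vocab = 40000    # assumed BPE vocabulary size
n_special = 3      # _start_, _delimiter_, _classify_
n_ctx = 77         # assumed max tokens per input sequence
n_embd = 12 * 64   # 768: 12 attention heads, 64 dims per head

# One embedding matrix covers tokens, specials, and positions,
# so its total number of rows is:
total_embeddings = n_vocab + n_special + n_ctx

# Position ids are offset past all token and special-token ids:
offset = n_vocab + n_special
position_ids = list(range(offset, offset + n_ctx))
```

With this layout, a position index never collides with a token index, which is why the input-building code adds `n_vocab + n_special` as an offset before looking up position embeddings.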
If these comments are helpful to you, I will add them to the code.

rodgzilla avatar Jul 24 '18 12:07 rodgzilla

Exactly!

thomwolf avatar Jul 24 '18 13:07 thomwolf

@rodgzilla thanks!

mehdimashayekhi avatar Jul 26 '18 16:07 mehdimashayekhi