neural-api icon indicating copy to clipboard operation
neural-api copied to clipboard

Code GPT-3 Small from the paper Language Models are Few-Shot Learners

Open joaopauloschuler opened this issue 1 year ago • 0 comments

Code architecture similar to GPT-3 Small from the paper Language Models are Few-Shot Learners.

As per table 2.1 from the paper, GPT-3 Small is composed by

  • 12 transformer decoders,
  • 768 hidden dimensions and
  • 3072 intermediate dimensions.

joaopauloschuler avatar Aug 28 '24 01:08 joaopauloschuler