pytorch-openai-transformer-lm

πŸ₯A PyTorch implementation of OpenAI's finetuned transformer language model with a script to import the weights pre-trained by OpenAI

24 pytorch-openai-transformer-lm issues, sorted by recently updated

I adapted this model to a text classification problem, where my text is concatenated as: [start] text1 [delimiter] text2 [delimiter] text3 [classify] and it is just a binary classification problem....
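
A minimal sketch of how such a single-sequence classification input could be packed, using hypothetical special-token ids appended after the BPE vocabulary (the real ids depend on how the encoder is set up in text_utils / train.py):

```
import numpy as np

# Hypothetical special-token ids appended after the BPE vocabulary.
n_vocab = 40478
start, delimiter, classify = n_vocab, n_vocab + 1, n_vocab + 2

def build_clf_input(text1_ids, text2_ids, text3_ids, n_ctx=512):
    """Pack [start] text1 [delimiter] text2 [delimiter] text3 [classify] into one padded row."""
    seq = ([start] + text1_ids + [delimiter] + text2_ids
           + [delimiter] + text3_ids + [classify])[:n_ctx]
    x = np.zeros(n_ctx, dtype=np.int64)
    m = np.zeros(n_ctx, dtype=np.float32)
    x[:len(seq)] = seq
    m[:len(seq)] = 1.0
    return x, m

x, m = build_clf_input([5, 6], [7, 8], [9])
```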

Similarity Head and Loss function were tested on the STS-B dataset, achieving nearly the same performance as reported (82.45 Pearson correlation relative to the 82.0 in the paper). I can provide...
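
For reference, a sketch of what a GPT-style similarity head looks like: the sentence pair is encoded in both orders, the hidden states at the `[classify]` token of each order are summed, and a linear layer regresses the score (trained with e.g. an MSE loss against the STS-B labels). The class below is an illustration, not the exact code from this repo:

```
import torch
import torch.nn as nn

class SimilarityHead(nn.Module):
    """Encode the pair in both orders, pool the hidden state at the [classify]
    token of each order, sum the two pooled vectors, regress a single score."""
    def __init__(self, n_embd, clf_token):
        super().__init__()
        self.clf_token = clf_token
        self.linear = nn.Linear(n_embd, 1)

    def forward(self, h_ab, h_ba, x_ab, x_ba):
        # h_*: (batch, seq, n_embd) transformer outputs, x_*: (batch, seq) token ids
        pooled = h_ab[x_ab.eq(self.clf_token)] + h_ba[x_ba.eq(self.clf_token)]
        return self.linear(pooled).squeeze(-1)  # (batch,) similarity scores
```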

I'd like to use GPT to encode my dataset and use the representations further for the task of question generation. I have problems with understanding the code and the name...
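
A rough sketch of using the pre-trained transformer purely as an encoder, assuming the repo's `(batch, n_ctx, 2)` input layout with token ids in channel 0 and position ids in channel 1 (the input tensor here is a dummy):

```
import torch
from model_pytorch import TransformerModel, load_openai_pretrained_model, DEFAULT_CONFIG

model = TransformerModel(DEFAULT_CONFIG)
load_openai_pretrained_model(model)
model.eval()

# Dummy batch: token ids in channel 0, position ids in channel 1.
x = torch.zeros(1, 512, 2, dtype=torch.long)
with torch.no_grad():
    h = model(x)  # one hidden vector per position, usable as sequence representations
```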

I am trying to use this repository to train a language model with an additional input. My data looks like this: ``` ┌─────────┬─────┬────┬───┐ │side info│start│The │cat│ └─────────┴─────┴────┴───┘ ``` The labels...
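
One way such a layout could be packed, shown as an illustration with made-up token ids; the LM-loss mask is zeroed on the side-info and start positions so the model is only trained to predict the text:

```
import numpy as np

# Illustration only: a made-up id for one side-information token prepended
# before the usual start token.
n_ctx = 512
side_info_id, start_id = 9001, 9002     # hypothetical special-token ids
text_ids = [11, 22, 33]                 # "The cat ..." after BPE encoding

x = np.zeros(n_ctx, dtype=np.int64)
m = np.zeros(n_ctx, dtype=np.float32)   # LM-loss mask
seq = [side_info_id, start_id] + text_ids
x[:len(seq)] = seq
m[2:len(seq)] = 1.0                     # no LM loss on the side-info / start positions
```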

Hi all, I am trying to train a new dataset with a similar structure to rocstories. It has a story part, 2 options and one correct option. I just added...

Given the transformation method for the ROC stories dataset, which is ``` def transform_roc(X1, X2, X3): n_batch = len(X1) xmb = np.zeros((n_batch, 2, n_ctx, 2), dtype=np.int32) mmb = np.zeros((n_batch, 2, n_ctx),...
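
For context, a sketch of the full transform: `xmb` has shape `(n_batch, 2, n_ctx, 2)` because axis 1 holds the two candidate endings and the last axis holds token ids (channel 0) and position ids (channel 1), while `mmb` masks out the padding. The constants below are illustrative stand-ins for the globals set up in `train.py`:

```
import numpy as np

# Illustrative values; in train.py these come from the encoder / dataset.
n_vocab, n_special, n_ctx, max_len = 40478, 3, 77, 37
start_token, delimiter_token, clf_token = n_vocab, n_vocab + 1, n_vocab + 2

def transform_roc(X1, X2, X3):
    n_batch = len(X1)
    xmb = np.zeros((n_batch, 2, n_ctx, 2), dtype=np.int32)   # [..., 0] token ids, [..., 1] position ids
    mmb = np.zeros((n_batch, 2, n_ctx), dtype=np.float32)    # 1 where a real token sits, 0 on padding
    for i, (x1, x2, x3) in enumerate(zip(X1, X2, X3)):
        # Two candidate sequences: story + ending 1 and story + ending 2
        x12 = [start_token] + x1[:max_len] + [delimiter_token] + x2[:max_len] + [clf_token]
        x13 = [start_token] + x1[:max_len] + [delimiter_token] + x3[:max_len] + [clf_token]
        xmb[i, 0, :len(x12), 0] = x12
        xmb[i, 1, :len(x13), 0] = x13
        mmb[i, 0, :len(x12)] = 1
        mmb[i, 1, :len(x13)] = 1
    # Position ids are extra vocabulary indices placed after the BPE and special tokens
    xmb[:, :, :, 1] = np.arange(n_vocab + n_special, n_vocab + n_special + n_ctx)
    return xmb, mmb
```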

I know that `n_vocab` is the total number of tokens in the encoder dictionary. But when I saw `vocab = n_vocab + n_special + n_ctx`, I was confused; maybe `n_special`...
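
The extra terms come from this implementation sharing one embedding matrix for word, special and position embeddings: position ids are simply extra vocabulary indices placed after the BPE and special tokens. An illustration (the sizes depend on the task setup):

```
# Layout of the shared embedding matrix in this implementation (illustrative sizes):
n_vocab   = 40478                      # BPE tokens from the OpenAI encoder
n_special = 3                          # e.g. _start_, _delimiter_, _classify_
n_ctx     = 77                         # sequence length, one learned embedding per position

vocab = n_vocab + n_special + n_ctx    # total rows in the embedding matrix
# rows 0 .. n_vocab-1                    -> word (BPE) embeddings
# rows n_vocab .. n_vocab+n_special-1    -> special-token embeddings
# rows n_vocab+n_special .. vocab-1      -> position embeddings, looked up via the
#                                           position-id channel of the input
```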

From the [ConvAI slides](convai.io), it sounds like the Hugging Face submission was based on this model -- is the code for your ConvAI system available somewhere to take a...

I am confused by the code below. https://github.com/huggingface/pytorch-openai-transformer-lm/blob/eafc28abdfadfa0732f03a0fc65805c5bfb2ffe7/train.py#L52 https://github.com/huggingface/pytorch-openai-transformer-lm/blob/eafc28abdfadfa0732f03a0fc65805c5bfb2ffe7/train.py#L54 Is this due to some normalization? Thanks!

Just curious -- what would be the place to start to create a seq2seq model for response generation on, say, the Persona-Chat dataset?
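
One common way to frame this is conditional language modelling: concatenate persona, dialogue history and the gold reply into a single sequence and apply the LM loss to the reply only. A sketch with made-up token ids (the special tokens here are hypothetical, not part of this repo):

```
# All ids below are made up for illustration; a real setup would get them
# from the BPE encoder plus a few added special tokens.
bos, speaker1, speaker2, eos = 0, 1, 2, 3
persona = [10, 11, 12]          # encoded persona sentences
history = [20, 21]              # encoded dialogue history so far
reply   = [30, 31, 32]          # encoded gold response

inp = [bos] + persona + [speaker1] + history + [speaker2] + reply + [eos]
# -1 marks positions ignored by the LM loss (the shift-by-one is left to the
# training loop); only the reply and the end token are predicted.
labels = [-1] * (len(inp) - len(reply) - 1) + reply + [eos]
assert len(labels) == len(inp)
```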