pytorch-openai-transformer-lm
How to create transforms for entailment task?
Given the transformation method for the ROC Stories dataset, which is:
def transform_roc(X1, X2, X3):
    n_batch = len(X1)
    xmb = np.zeros((n_batch, 2, n_ctx, 2), dtype=np.int32)
    mmb = np.zeros((n_batch, 2, n_ctx), dtype=np.float32)
    start = encoder['_start_']
    delimiter = encoder['_delimiter_']
    for i, (x1, x2, x3) in enumerate(zip(X1, X2, X3)):
        x12 = [start] + x1[:max_len] + [delimiter] + x2[:max_len] + [clf_token]
        x13 = [start] + x1[:max_len] + [delimiter] + x3[:max_len] + [clf_token]
        l12 = len(x12)
        l13 = len(x13)
        xmb[i, 0, :l12, 0] = x12
        xmb[i, 1, :l13, 0] = x13
        mmb[i, 0, :l12] = 1
        mmb[i, 1, :l13] = 1
    # Position information that is added to the input embeddings in the TransformerModel
    xmb[:, :, :, 1] = np.arange(n_vocab + n_special, n_vocab + n_special + n_ctx)
    return xmb, mmb
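For context, the second channel `xmb[:, :, :, 1]` holds position IDs offset by `n_vocab + n_special`, because the model looks them up in the same embedding table as the tokens and sums the two embeddings. A minimal illustrative sketch of that idea (not this repository's exact model code; `embed_matrix` and `embed_with_positions` are hypothetical names):

```python
import numpy as np

# Sketch only: xmb's last axis is (token_id, position_id); position ids start
# at n_vocab + n_special so they index extra rows of the shared embedding matrix.
n_vocab, n_special, n_ctx, n_embd = 40000, 3, 512, 768
embed_matrix = np.random.randn(n_vocab + n_special + n_ctx, n_embd)

def embed_with_positions(x):
    # x: integer array of shape (..., n_ctx, 2)
    e = embed_matrix[x]        # (..., n_ctx, 2, n_embd)
    return e.sum(axis=-2)      # token embedding + position embedding
```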
I have created the following transform for the entailment task:
def transform_entailment(X1, X2):
    n_batch = len(X1)
    xmb = np.zeros((n_batch, 1, n_ctx, 2), dtype=np.int32)
    mmb = np.zeros((n_batch, 1, n_ctx), dtype=np.float32)
    start = encoder['_start_']
    delimiter = encoder['_delimiter_']
    for i, (x1, x2) in enumerate(zip(X1, X2)):
        x12 = [start] + x1[:max_len] + [delimiter] + x2[:max_len] + [clf_token]
        l12 = len(x12)
        xmb[i, 0, :l12, 0] = x12
        mmb[i, 0, :l12] = 1
    # Position information that is added to the input embeddings in the TransformerModel
    xmb[:, :, :, 1] = np.arange(n_vocab + n_special, n_vocab + n_special + n_ctx)
    return xmb, mmb
Using this, I am getting the following error during loss computation:
Namespace(afn='gelu', analysis=False, attn_pdrop=0.1, b1=0.9, b2=0.999, bpe_path='data/dataset_tweet_encode/vocab_40000.bpe', clf_pdrop=0.1, data_dir='data/', dataset=None, desc=None, e=1e-08, embd_pdrop=0.1, encoder_path='data/dataset_tweet_encode/encoder_bpe_40000.json', l2=0.01, lm_coef=0.5, log_dir='log/', lr=6.25e-05, lr_schedule='warmup_linear', lr_warmup=0.002, max_grad_norm=1, n_batch=1, n_ctx=512, n_embd=768, n_head=12, n_iter=3, n_layer=12, n_transfer=12, n_valid=0.1, opt='adam', resid_pdrop=0.1, save_dir='save/', seed=42, submission_dir='submission/', submit=False, vector_l2=False)
Traceback (most recent call last):
  File "train.py", line 225, in <module>
    run_epoch()
  File "train.py", line 83, in run_epoch
    compute_loss_fct(XMB, YMB, MMB, clf_logits, lm_logits)
  File "/home/lordzuko/PycharmProjects/Transformer-Pytorch/loss.py", line 53, in __call__
    lm_losses = self.lm_criterion(lm_logits, x_shifted)
  File "/home/lordzuko/envs/pytorch-0.4/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/lordzuko/envs/pytorch-0.4/lib/python3.6/site-packages/torch/nn/modules/loss.py", line 862, in forward
    ignore_index=self.ignore_index, reduction=self.reduction)
  File "/home/lordzuko/envs/pytorch-0.4/lib/python3.6/site-packages/torch/nn/functional.py", line 1550, in cross_entropy
    return nll_loss(log_softmax(input, 1), target, weight, None, ignore_index, None, reduction)
  File "/home/lordzuko/envs/pytorch-0.4/lib/python3.6/site-packages/torch/nn/functional.py", line 1405, in nll_loss
    .format(input.size(0), target.size(0)))
ValueError: Expected input batch_size (66) to match target batch_size (0).
Can anyone please guide me through this?
Do you use ClassificationLossCompute? Because I had the same problem when I used it for a classification task, but it turns out it computes the same loss (cross entropy) as MultipleChoiceLossCompute; only the views (reshapes) are different, and they are bugged in ClassificationLC. So you just have to replace ClassificationLC with MultipleChoiceLC and it should work.
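For reference, a minimal sketch of that swap in train.py, assuming both loss classes in loss.py take the same constructor arguments as in the rocstories setup (LM criterion, classification criterion, lm_coef, optimizer); `args` and `model_opt` come from the existing training script:

```python
import torch.nn as nn
from loss import MultipleChoiceLossCompute  # instead of ClassificationLossCompute

# Sketch, not verified against every version of loss.py: both classes compute
# the same cross-entropy losses; only the internal reshapes differ.
criterion = nn.CrossEntropyLoss(reduce=False)
compute_loss_fct = MultipleChoiceLossCompute(criterion, criterion,
                                             args.lm_coef, model_opt)

# The call inside run_epoch stays unchanged:
# compute_loss_fct(XMB, YMB, MMB, clf_logits, lm_logits)
```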
@artemisart Yes, I was using the same; I made the changes you suggested and now it's working. Thank you!!
> Do you use ClassificationLossCompute? Because I had the same problem when I used it for a classification task, but it turns out it computes the same loss (cross entropy) as MultipleChoiceLossCompute; only the views (reshapes) are different, and they are bugged in ClassificationLC. So you just have to replace ClassificationLC with MultipleChoiceLC and it should work.
Hi @artemisart @lordzuko! If there is a bug in ClassificationLossCompute, do you think it's worth opening an issue specifically on it?
> xmb = np.zeros((n_batch, 2, n_ctx, 2), dtype=np.int32)
> mmb = np.zeros((n_batch, 2, n_ctx), dtype=np.float32)
Hi, can anyone tell me what the 2 between n_batch and n_ctx means?
From my understanding, it's the number of texts to be processed in parallel (for each example) by the Transformer, so 1 for text classification, 2 for the Story Cloze Test (rocstories), n for multiple choice, etc.
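In other words, that dimension is the number of input sequences packed per example. A small sketch of the corresponding array shapes (variable names are just illustrative):

```python
import numpy as np

n_batch, n_ctx = 4, 512

# One sequence per example (e.g. single-text classification, or entailment with
# premise and hypothesis joined by a delimiter token):
xmb_clf = np.zeros((n_batch, 1, n_ctx, 2), dtype=np.int32)

# Two candidate endings per example (ROC Stories / Story Cloze Test):
xmb_roc = np.zeros((n_batch, 2, n_ctx, 2), dtype=np.int32)

# n candidate answers per example (generic multiple choice):
n_choices = 4
xmb_mc = np.zeros((n_batch, n_choices, n_ctx, 2), dtype=np.int32)
```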
Hi, @artemisart is correct. I uploaded the openai-gpt for the classification task and it can reproduce the results reported in the original paper.
What is the use of mmb?
> What is the use of mmb?
I see, it is the mask.
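Indeed, mmb marks which positions hold real tokens (1.0) versus padding (0.0). A hedged sketch of how such a mask is typically applied so padded positions do not contribute to the per-example language-modelling loss (not this repository's exact loss code; `masked_lm_loss` is a hypothetical helper):

```python
import torch

# Sketch only: zero out per-token LM losses at padded positions and
# average over real tokens only.
def masked_lm_loss(per_token_losses, mmb):
    # per_token_losses, mmb: float tensors of shape (batch, seq_len)
    masked = per_token_losses * mmb
    return masked.sum(dim=1) / mmb.sum(dim=1)
```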
> @artemisart Yes, I was using the same; I made the changes you suggested and now it's working. Thank you!!
Hi, I am also trying to use the transformer model for the entailment task and to replicate the SNLI results in the paper. Could you please let me know what changes you made to the train.py file for the entailment task?
> Do you use ClassificationLossCompute? Because I had the same problem when I used it for a classification task, but it turns out it computes the same loss (cross entropy) as MultipleChoiceLossCompute; only the views (reshapes) are different, and they are bugged in ClassificationLC. So you just have to replace ClassificationLC with MultipleChoiceLC and it should work.
>
> Hi @artemisart @lordzuko! If there is a bug in ClassificationLossCompute, do you think it's worth opening an issue specifically on it?
Hello Davidefiocco, could you please upload the dataset you have processed? I am also a freshman who wants to see the whole process of running this program. If it is convenient for you, you can also send the two files to the mailbox [email protected]. Thank you very much. I saw your thoughts and answers, and they helped me a lot.
> @artemisart Yes, I was using the same; I made the changes you suggested and now it's working. Thank you!!
Thanks very much! Reading your conversation with them helped me a lot.