pytorch-openai-transformer-lm
Can someone explain this line?
If my understanding is correct, this line finds the positions where there is a delimiter and filters for them. How does this help with training?
https://github.com/huggingface/pytorch-openai-transformer-lm/blob/253ca422bbf94b19da2a4aa8f1b294e01ab8be37/model_pytorch.py#L207
When the information reaches the classification head, there is one vector of dimension n_embd
associated with each position of each input. If you want a single prediction per input (as is the case with classification tasks), you have to select one of these vectors.
As the transformer network is auto-regressive, the value you select has to be the rightmost one, which corresponds to clf_token,
since the input is created like this:
x12 = [start] + x1[:max_len] + [delimiter] + x2[:max_len] + [clf_token]
x13 = [start] + x1[:max_len] + [delimiter] + x3[:max_len] + [clf_token]
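In code, the selection looks roughly like this. This is only a minimal sketch, not the repository's exact implementation; pool_clf_token is a hypothetical helper, and I'm assuming hidden states h of shape (batch, seq_len, n_embd) and token ids x of shape (batch, seq_len):

import torch

def pool_clf_token(h, x, clf_token):
    # h: (batch, seq_len, n_embd) hidden states out of the transformer
    # x: (batch, seq_len) token ids of the corresponding inputs
    batch, seq_len, n_embd = h.shape
    flat_h = h.reshape(-1, n_embd)        # (batch * seq_len, n_embd)
    flat_x = x.reshape(-1)                # (batch * seq_len,)
    # clf_token appears exactly once, at the end of each input, so this keeps
    # exactly one row per sequence: the rightmost hidden state.
    return flat_h[flat_x == clf_token]    # (batch, n_embd)

The returned (batch, n_embd) tensor is what the classification head then projects to class logits.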
@rodgzilla Thanks a lot for the explanation, it makes a lot of sense! Out of curiosity, why can't all the values be used?
Well, for a classifier we usually want a fixed-length representation of the sentence, so we can't really use a varying number of values. Starting from that, the last hidden state is the most logical summary of the sentence. But there are other possible options of course, feel free to try your ideas!
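For example, here is a hedged sketch of two common fixed-length pooling options (last_token_pool and mean_pool are illustrative names, not functions from this repository; h is assumed to be (batch, seq_len, n_embd) and mask marks the non-padding positions):

import torch

def last_token_pool(h):
    # rightmost hidden state of every sequence
    return h[:, -1, :]

def mean_pool(h, mask):
    # average over the real (non-padding) positions
    mask = mask.unsqueeze(-1).float()                            # (batch, seq_len, 1)
    return (h * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1)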
In the original OpenAI code (https://github.com/openai/finetune-transformer-lm/blob/bd1cf7d678926041e6d19193cab7e5cd8ce2fce6/train.py#L191), in the model function in train.py there is the line clf_logits = clf(clf_h, 1, train=train).
Why is ny 1? Shouldn't it be 2, because we have two classes? Is there a reason to use 1 and then later reshape the second dimension of the logits to 2? I really appreciate your help.