attention-networks-for-classification

transpose?

Open hungpthanh opened this issue 7 years ago • 3 comments

Why do you need the transpose here:

    _s, state_word, _ = word_attn_model(mini_batch[i,:,:].transpose(0,1), state_word)

and here, in def pad_batch:

    torch.from_numpy(main_matrix).transpose(0,1)

Thanks :)

hungpthanh · Oct 03 '17

I think the transpose was used because PyTorch expects batch_size in the second dimension. It's been a while since I wrote this code, but I did check all the dimensions from start to end when I developed it. :)
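For illustration, here is a minimal sketch of the convention I mean (made-up sizes, not this repo's actual code): with the default batch_first=False, a PyTorch GRU expects input of shape (seq_len, batch_size, input_size), so a batch-first tensor has to be transposed before it goes into the RNN.

    import torch
    import torch.nn as nn

    # hypothetical sizes, for illustration only
    batch_size, max_tokens, embed_dim, hidden_dim = 4, 10, 8, 16
    gru = nn.GRU(embed_dim, hidden_dim)  # batch_first defaults to False

    x = torch.randn(batch_size, max_tokens, embed_dim)  # a batch-first tensor
    out, h = gru(x.transpose(0, 1))  # transposed to (max_tokens, batch_size, embed_dim)
    print(out.shape)  # torch.Size([10, 4, 16]) -> (seq_len, batch, hidden)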

Sandeep42 · Oct 03 '17

Thank you so much :+1:

hungpthanh · Oct 04 '17

@Sandeep42 @hungthanhpham94 I wonder whether there is an error here, given what PyTorch expects.

In the function train_data(), it's written:

 for i in xrange(max_sents):
        _s, state_word, _ = word_attn_model(mini_batch[i,:,:].transpose(0,1), state_word)

This way, after the .transpose(0,1), the mini-batch slice passed to word_attn_model has shape (max_tokens, batch_size).

However, the first function to be called is self.lookup(embed), which expects a (batch_size, list_of_indices) input.

If this is correct, all the code that follows would need to be fixed.
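One quick way to check what self.lookup accepts (a sketch with hypothetical sizes, not the repo's code): nn.Embedding indexes element-wise and simply appends the embedding dimension to whatever index shape it receives, so the shape question may really concern the layers after the lookup.

    import torch
    import torch.nn as nn

    # hypothetical sizes, for illustration only
    vocab_size, embed_dim = 100, 8
    max_tokens, batch_size = 10, 4
    lookup = nn.Embedding(vocab_size, embed_dim)

    idx = torch.randint(0, vocab_size, (max_tokens, batch_size))  # seq-first indices
    print(lookup(idx).shape)  # torch.Size([10, 4, 8]) == (max_tokens, batch_size, embed_dim)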

gabrer · Apr 13 '18