Questions regarding the implementation
- In this file https://github.com/hyunwoongko/transformer/blob/master/models/model/transformer.py, you define the functions `make_pad_mask` and `make_no_peak_mask`, but are they actually used during training?
- In this file, https://github.com/hyunwoongko/transformer/blob/master/models/layers/position_wise_feed_forward.py, why does your `PositionwiseFeedForward` have extra layers?
- Yes, see https://github.com/hyunwoongko/transformer/blob/master/models/model/transformer.py#L40.
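For readers following along, here is a minimal sketch of how these two masks are typically built in PyTorch. This is not the repo's exact code; the tensor shapes, the `pad_idx` default, and the broadcasting convention are assumptions:

```python
import torch

def make_pad_mask(q, k, pad_idx=1):
    # True where the key token is real, False where it is padding;
    # shaped (batch, 1, 1, k_len) so it broadcasts over heads and query positions
    return (k != pad_idx).unsqueeze(1).unsqueeze(2)

def make_no_peak_mask(q, k):
    # lower-triangular causal mask: query position i may only attend
    # to key positions j <= i
    len_q, len_k = q.size(1), k.size(1)
    return torch.tril(torch.ones(len_q, len_k)).bool()
```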
- What do you mean by "extra layers"?
Thank you for your reply :)
For (2), yes, you are right actually. I made a mistake when comparing the class `PositionwiseFeedForward` with these two implementations: https://github.com/bentrevett/pytorch-seq2seq/blob/master/6%20-%20Attention%20is%20All%20You%20Need.ipynb and http://nlp.seas.harvard.edu/2018/04/03/attention.html.
But now I see that yours is the same; you just coded it in a different format.
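For reference, here is a minimal sketch of that standard block, i.e. two linear layers with a ReLU and dropout in between (not this repo's exact code; `d_model=512` and `hidden=2048` are the paper's defaults, not necessarily this repo's config):

```python
import torch.nn as nn

class PositionwiseFeedForward(nn.Module):
    # FFN(x) = max(0, x W1 + b1) W2 + b2, applied identically at every position
    def __init__(self, d_model=512, hidden=2048, drop_prob=0.1):
        super().__init__()
        self.linear1 = nn.Linear(d_model, hidden)
        self.linear2 = nn.Linear(hidden, d_model)
        self.relu = nn.ReLU()
        self.dropout = nn.Dropout(p=drop_prob)

    def forward(self, x):
        x = self.relu(self.linear1(x))
        x = self.dropout(x)
        return self.linear2(x)
```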
I have another question: how could we visualise the attention heatmap at the decoder heads, similar to https://github.com/bentrevett/pytorch-seq2seq/blob/master/6%20-%20Attention%20is%20All%20You%20Need.ipynb?
And here, https://github.com/hyunwoongko/transformer/blob/1d2e33f675232956ef4bc3fbb1c3de2300a1f0a7/models/model/transformer.py#L45, you use the `*` symbol for multiplication. In the other implementations, a bitwise `&` operator was used. I am just wondering what the difference is here. Thanks!
@GJ98 could you explain this? (Note: this mask implementation was not written by me.)
> I have another question: how could we visualise the attention heatmap at the decoder heads, similar to …
I was planning to implement it, but I didn't get to it because I didn't have enough time. I welcome a PR!
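In the meantime, a rough sketch of what such a PR could look like. It assumes the attention module is first modified to return (or store) its softmax weights, which the repo does not currently do, so `plot_decoder_attention` and its input shape are hypothetical:

```python
import matplotlib.pyplot as plt

def plot_decoder_attention(attention, src_tokens, trg_tokens):
    # attention: tensor of shape (n_heads, trg_len, src_len), assumed to be
    # the softmax weights captured from one decoder layer during a forward pass
    n_heads = attention.size(0)
    fig, axes = plt.subplots(1, n_heads, figsize=(4 * n_heads, 4), squeeze=False)
    for h in range(n_heads):
        ax = axes[0][h]
        ax.matshow(attention[h].detach().cpu().numpy(), cmap='viridis')
        ax.set_xticks(range(len(src_tokens)))
        ax.set_xticklabels(src_tokens, rotation=90)
        ax.set_yticks(range(len(trg_tokens)))
        ax.set_yticklabels(trg_tokens)
        ax.set_title(f'head {h}')
    plt.tight_layout()
    plt.show()
```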
@Yuvaraj91 There is no difference between `*` and `&` here. I think `&` can be clearer than `*`.
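To see the equivalence, the two operators coincide when the mask tensors contain only zeros and ones:

```python
import torch

a = torch.tensor([1, 0, 1, 1])
b = torch.tensor([1, 1, 0, 1])

print(a * b)  # tensor([1, 0, 0, 1]) -- elementwise multiplication
print(a & b)  # tensor([1, 0, 0, 1]) -- bitwise AND, identical on 0/1 masks
```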
Ok thank you both @GJ98 @hyunwoongko !
Another question: where does the value of 256 come from? https://github.com/hyunwoongko/transformer/blob/1d2e33f675232956ef4bc3fbb1c3de2300a1f0a7/conf.py#L13
What do you mean?
Hi, could you share the full requirements of this repo (e.g. the PyTorch version)? Thanks.