hiersumm
Is MultiHeadpooling the same as in the paper?
Hi @nlpyang,
I have some questions.
- In your paper, equation (15) is not used, so why do you propose it?
Regarding equation (13) and equation (14):
I don't understand why you do this. Can you give some explanation? In addition, a^z and b^z do not appear in your code.
scores = self.linear_keys(key)
value = self.linear_values(value)
scores = shape(scores, 1).squeeze(-1)
value = shape(value)
# key_len = key.size(2)
# query_len = query.size(2)
#
# scores = torch.matmul(query, key.transpose(2, 3))
if mask is not None:
    mask = mask.unsqueeze(1).expand_as(scores)
    scores = scores.masked_fill(mask, -1e18)
You also do not compute the scores the way the paper describes. Why? Best wishes!
-
Equation 16 has a typo: it should use \hat{a}, not a.
-
You can think of a and b as the scores and values, which do appear in the code at https://github.com/nlpyang/hiersumm/blob/476e6bf9c716326d6e4c27d5b6878d0816893659/src/abstractive/attn.py#L241 and https://github.com/nlpyang/hiersumm/blob/476e6bf9c716326d6e4c27d5b6878d0816893659/src/abstractive/attn.py#L242
-
I can't see why this is different from the paper.
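For anyone reading this thread later, here is a rough NumPy sketch of how the snippet above seems to line up with the notation discussed here. It is not the repository's code. The function name `multi_head_pooling`, the weight names `w_a` and `w_b`, and all shapes are made up for illustration, and any normalization or output projection applied after pooling is left out. It assumes a is one scalar score per head, b is the per-head value vector, and \hat{a} is the softmax of a over the tokens.

```python
import numpy as np


def multi_head_pooling(x, w_a, w_b, mask=None):
    """Pool a sequence of token vectors into one vector per head.

    x:    (n_tokens, d_model) token representations
    w_a:  (d_model, n_heads) projection giving one scalar score per head ("a")
    w_b:  (d_model, n_heads * d_head) projection giving per-head values ("b")
    mask: optional (n_tokens,) boolean array, True marks positions to ignore
    """
    n_tokens = x.shape[0]
    n_heads = w_a.shape[1]

    scores = x @ w_a                                   # a: (n_tokens, n_heads)
    values = (x @ w_b).reshape(n_tokens, n_heads, -1)  # b: (n_tokens, n_heads, d_head)

    if mask is not None:
        # Same idea as masked_fill(mask, -1e18): masked tokens get ~zero weight.
        scores = np.where(mask[:, None], -1e18, scores)

    # Softmax over the token axis gives a_hat (numerically stabilised).
    scores = scores - scores.max(axis=0, keepdims=True)
    a_hat = np.exp(scores)
    a_hat = a_hat / a_hat.sum(axis=0, keepdims=True)

    # Weighted sum of values with a_hat (not the raw a) as weights.
    heads = (a_hat[:, :, None] * values).sum(axis=0)   # (n_heads, d_head)
    return a_hat, heads


# Toy inputs: 5 tokens, d_model = 8, 2 heads of size 4, last two tokens masked.
rng = np.random.default_rng(0)
x = rng.normal(size=(5, 8))
w_a = rng.normal(size=(8, 2))
w_b = rng.normal(size=(8, 2 * 4))
mask = np.array([False, False, False, True, True])
a_hat, heads = multi_head_pooling(x, w_a, w_b, mask)
```

Read this way, the weighted sum uses the normalized weights \hat{a}, which matches the typo correction mentioned above, and a and b correspond to `scores` and `value` in the quoted snippet.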