xnmt
xnmt copied to clipboard
Stative Attenders
Right now all the attenders are state-free, but there exist some attenders that require some concept of state. The most obvious one are things that involve coverage.
I propose that we add an AttenderState
, have Attender.init_sent()
return an initial state, and then have Attender.calc_attention()
(and its cousin Attender.calc_context()
) take in an AttenderState
and return a new (i.e. updated) AttenderState
.
Thoughts?
I think this can be merged with Decoder state since only the decoder uses the attender.
On Sat, Feb 9, 2019 at 3:41 PM Austin Matthews [email protected] wrote:
Right now all the attenders are state-free, but there exist some attenders that require some concept of state. The most obvious one are things that involve coverage https://arxiv.org/pdf/1601.04811.pdf.
I propose that we add an AttenderState, have Attender.init_sent() return an initial state, and then have Attender.calc_attention() (and its cousin Attender.calc_context()) take in an AttenderState and return a new (i.e. updated) AttenderState.
Thoughts?
— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/neulab/xnmt/issues/566, or mute the thread https://github.com/notifications/unsubscribe-auth/AE_GbI9cRyYCr2XQShWeh3l_TNlk2hkDks5vLlGBgaJpZM4ayYKl .
-- Philip
Hmm, that's a good point.
We still would need some sort of mechanism to have the Attender
give some sort of initial state, and then some mechanism to get the next state after adding some input.
Another consideration is that right now the Attender
is a direct descendant of the Model
, and not contained within the Decoder
. That makes it hard, for example, for decoder.init_sent()
to call attender.init_sent()
. Perhaps we should move the attender to within the decoder?
Fixed in this PR.