marian_decoder starting and ending logic
I was inspecting intermediate values of the output tensor in transformer.h while running marian_decoder, and noticed that on the first step through the decoder some token is passed whose word embedding is all zeros.
Q1) What token is used as a prefix? Is there some trick that makes its embedding 0?
Q2) How does the decoder know to terminate a translation? In my python port of the opus-nmt models, the decoder never predicts </s>.
Additional Clues
My python port of the opus-nmt models works nicely when English is the source language, and just generates a dummy token when it is done translating. For fr-en, however, it generates nonsense at the beginning of the generation, whereas marian-decoder generates no nonsense at all :)
sample_text = "Donnez moi le micro ."
my_result = ', uh... give me the microphone .'  # after constraining max_length
marian_decoder_result = 'Give me the microphone!'  # after sentencepiece detokenization
Thanks in advance!
Q1: The embedding of the sentence-start (BOS or <s>) context is hard-coded to be 0. It is not copied from the embedding matrix. I always felt that's a bug, but anecdotally, it makes no accuracy difference.
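For intuition, here is a minimal sketch (plain numpy, not Marian's actual code) of what "hard-coded to 0" means at the first decoder step; `embed_decoder_input`, `embedding_matrix`, and `dim_emb` are made-up names for illustration:

```python
import numpy as np

dim_emb = 8
vocab_size = 100
# Hypothetical target-side embedding matrix.
embedding_matrix = np.random.randn(vocab_size, dim_emb)

def embed_decoder_input(prev_token: int, step: int) -> np.ndarray:
    # Step 0: the sentence-start context is a zero vector; nothing is
    # looked up from embedding_matrix, not even the row for <s>.
    if step == 0:
        return np.zeros(dim_emb)
    # Later steps embed the previously generated token as usual.
    return embedding_matrix[prev_token]
```

So a python port can simply feed a zero vector at step 0 instead of looking up a BOS row, which may explain the garbage your port produces at the start of generation.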
Q2: Each beam hypothesis that ends in EOS (or </s>) will cease to be expanded. Once all hyps for a sentence end in EOS, sentence translation is complete.
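Roughly, that stopping rule looks like this in Python (a sketch under assumed data shapes, not Marian's API; `expand_fn` stands in for the model's next-token scorer and `EOS_ID` for the </s> id):

```python
EOS_ID = 0  # placeholder id for </s>

def step_beam(hyps, expand_fn, beam_size):
    """One beam-search step: hyps is a list of (token_ids, score) pairs."""
    finished = [h for h in hyps if h[0][-1] == EOS_ID]
    active = [h for h in hyps if h[0][-1] != EOS_ID]
    # Finished hypotheses are carried over unchanged; only active ones expand.
    candidates = finished[:]
    for tokens, score in active:
        for tok, logp in expand_fn(tokens):  # model's next-token candidates
            candidates.append((tokens + [tok], score + logp))
    # Keep the best `beam_size` hypotheses overall.
    candidates.sort(key=lambda h: h[1], reverse=True)
    return candidates[:beam_size]

def translation_done(hyps):
    # A sentence is fully translated once every surviving hypothesis
    # ends in EOS.
    return all(h[0][-1] == EOS_ID for h in hyps)
```

If your port never predicts </s>, check that the EOS row of the output projection (and its bias, if any) is actually loaded, since otherwise the loop above never terminates except via max_length.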