EditNTS
The input to the model and the target are the same.
Hi, I have been trying to understand the provided code and have certain concerns about the decoder.
main.py contains the following pieces of code:
out = prepared_batch[2][:, :]
tar = prepared_batch[2][:, 1:]
I presume these are the output and the target (and carry the same information content, i.e., the edit actions).
out is then fed to edit_net, which does not seem to make sense:
output = edit_net(org, out, org_ids, org_pos, simp_ids)
Thereafter, in the decoder, the code uses the same out and manipulates it to create output_t, which is then returned as the result of the EditNet. I am unable to find the parts of your code where the prediction of actions actually happens.
I am unable to understand: if your model is already being fed the edit actions, what exactly is it predicting?
Hi @yuedongP, can you provide an update and elaborate on the above?
Hi, this is the standard way to train a seq2seq model with teacher forcing, where you shift the input and output by one and maximize the log-likelihood. (Please note that out starts at index 0 and tar starts at index 1; there is a shift here, so the input and the target are not the same!)
out = prepared_batch[2][:, :]
tar = prepared_batch[2][:, 1:]
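To make the shift concrete, here is a toy example (not from the repo) with a single hypothetical sequence of edit-label ids, using the same slicing as the two lines above:

```python
import torch

# Hypothetical edit-label ids for one sequence: [START, ADD, KEEP, DEL, STOP]
labels = torch.tensor([[4, 5, 1, 2, 3]])

out = labels[:, :]   # decoder input:   START, ADD, KEEP, DEL, STOP
tar = labels[:, 1:]  # training target:        ADD, KEEP, DEL, STOP

# At decoding step t the decoder is fed out[:, t] (the gold label of step t)
# and is scored against tar[:, t] (the gold label of step t+1), so it never
# sees the label it is asked to predict; the trailing position of out is
# simply not scored.
```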
Here we feed the expert edit sequence during training (but at time step t, the model can only see the gold edit label of time t-1), and if you look at the actual model (editNTS.py) you will see that the prediction is shifted by one during training.
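As a rough sketch of what that shifted, teacher-forced loop looks like (the decoder_step interface here is hypothetical and is not the actual editNTS.py API):

```python
import torch
import torch.nn.functional as F

def teacher_forced_loss(decoder_step, out, tar, init_state):
    """Hypothetical teacher-forcing loop: feed the gold label of step t,
    score the prediction against the gold label of step t+1."""
    state = init_state
    total_loss = torch.tensor(0.0)
    for t in range(tar.size(1)):
        logits, state = decoder_step(out[:, t], state)  # sees only gold labels up to t
        total_loss = total_loss + F.cross_entropy(logits, tar[:, t])
    return total_loss / tar.size(1)
```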
During inference, however, the model does not have this information and has to predict one edit at a time.
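At inference time the model instead feeds back its own previous prediction; a minimal greedy-decoding sketch under the same hypothetical decoder_step interface:

```python
import torch

def greedy_decode(decoder_step, init_state, start_id, stop_id, max_steps=100):
    """Hypothetical greedy inference: feed back the model's own last prediction."""
    state = init_state
    prev = torch.tensor([start_id])
    edits = []
    for _ in range(max_steps):
        logits, state = decoder_step(prev, state)
        prev = logits.argmax(dim=-1)  # the predicted edit action for this step
        if prev.item() == stop_id:
            break
        edits.append(prev.item())
    return edits
```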
Many thanks for your question, and I hope this clarifies your concern.