denspi
neg training code is different from the paper
Currently the neg training routine (`--train_neg` in `run_piqa.py`) is different from what is described in the paper.
In the paper, we use a 'no answer' logit to train on negative examples, so there is no separate neg training routine. In the code, there is a separate neg training routine that runs after normal training and attaches a neg example to each positive example, chosen so that their question embeddings are similar.
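To make the pairing idea concrete, here is a minimal sketch of one way to attach a neg example to each positive example by question-embedding similarity. It is not the code in `run_piqa.py`: the function names (`cosine`, `attach_negatives`), the dict keys (`question_emb`, `context`, `neg_context`), and the use of cosine similarity with a nearest-neighbour pick are all assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def attach_negatives(examples):
    """Pair each positive example with the context of its most similar other question.

    `examples` is a list of dicts with hypothetical keys:
      'question_emb': list of floats, 'context': str.
    Returns new dicts with an extra 'neg_context' key (None if there is no other example).
    """
    paired = []
    for i, ex in enumerate(examples):
        best_j, best_sim = None, float("-inf")
        for j, other in enumerate(examples):
            if j == i:
                continue  # an example must not serve as its own negative
            sim = cosine(ex["question_emb"], other["question_emb"])
            if sim > best_sim:
                best_j, best_sim = j, sim
        neg_context = examples[best_j]["context"] if best_j is not None else None
        paired.append({**ex, "neg_context": neg_context})
    return paired
```

A real routine would presumably work on batched tensors and may sample among several similar questions rather than always taking the nearest one; the sketch only shows the pairing step.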
In the code, several noise injections are also used.
In practice, the strategy in the current code is better than that in the paper (no answer logit). The paper will be updated soon and this issue will be resolved.