Vincent Nguyen comments

Results: 123 comments of Vincent Nguyen

[Help Wanted] add Bleu scoring for validation

Well, I think for now, we just need to calculate the corpus_bleu on the validation set. But please include sentence_bleu in the library, it could be useful for future work....
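To make the distinction in the comment above concrete, here is a minimal pure-Python sketch of corpus-level versus sentence-level BLEU. It is not the implementation that was merged into the library: the function names (`bleu_stats`, `bleu_from_stats`, `sentence_bleu`, `corpus_bleu`) are hypothetical, it assumes a single tokenized reference per hypothesis, and it applies no smoothing.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count the n-grams of order n in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu_stats(hyp, ref, max_n=4):
    """Clipped n-gram matches and hypothesis n-gram totals for one pair."""
    matches, totals = [], []
    for n in range(1, max_n + 1):
        hyp_ngrams = ngram_counts(hyp, n)
        ref_ngrams = ngram_counts(ref, n)
        # Counter intersection keeps the minimum count, i.e. clipped matches.
        matches.append(sum((hyp_ngrams & ref_ngrams).values()))
        totals.append(sum(hyp_ngrams.values()))
    return matches, totals


def bleu_from_stats(matches, totals, hyp_len, ref_len):
    """Geometric mean of the n-gram precisions times the brevity penalty."""
    if hyp_len == 0 or min(matches) == 0:
        return 0.0  # assumption: no smoothing in this sketch
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals))
    log_precision /= len(matches)
    brevity = 1.0 if hyp_len > ref_len else math.exp(1.0 - ref_len / hyp_len)
    return brevity * math.exp(log_precision)


def sentence_bleu(hyp, ref, max_n=4):
    """BLEU computed from the statistics of a single sentence pair."""
    matches, totals = bleu_stats(hyp, ref, max_n)
    return bleu_from_stats(matches, totals, len(hyp), len(ref))


def corpus_bleu(hyps, refs, max_n=4):
    """BLEU computed from statistics pooled over the whole validation set."""
    matches, totals = [0] * max_n, [0] * max_n
    hyp_len = ref_len = 0
    for hyp, ref in zip(hyps, refs):
        m, t = bleu_stats(hyp, ref, max_n)
        matches = [a + b for a, b in zip(matches, m)]
        totals = [a + b for a, b in zip(totals, t)]
        hyp_len += len(hyp)
        ref_len += len(ref)
    return bleu_from_stats(matches, totals, hyp_len, ref_len)
```

Because the corpus score pools counts before taking the geometric mean, it is generally not the average of the sentence scores. A short sentence with no 4-gram match scores zero on its own here but still contributes its lower-order matches to the corpus score.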

[WIP] The Missing Ingredient in Zero-Shot Neural Machine Translation

For the record, we discussed offline whether this should be in the code of NMTModel or Trainer. For performance reasons it needs to be in NMTModel (encoding through forward of...

Example-based Reweighting

The operation of reweighting is quite straightforward; the question is more what you want to reweight and based on what criterion. Maybe the best is to have a look...
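As an illustration of why the operation itself is simple, here is a minimal sketch of a weighted mean over per-example losses. The function name `reweighted_loss` is hypothetical, and the sketch deliberately says nothing about where the weights come from, which is the open question raised in the comment.

```python
def reweighted_loss(losses, weights):
    """Weighted mean of per-example losses.

    `weights` holds one non-negative weight per example; how they are
    chosen (the reweighting criterion) is outside this sketch.
    """
    if len(losses) != len(weights):
        raise ValueError("losses and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("at least one weight must be non-zero")
    return sum(w * l for w, l in zip(weights, losses)) / total_weight
```

With uniform weights this reduces to the ordinary mean loss, so any change in training behavior comes entirely from the chosen criterion.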

Upgrade torchtext to more recent releases

https://github.com/pytorch/text/issues/969 In fact I think we need to get rid of torchtext completely. Even if we keep it, there are some major changes to implement because Fields are deprecated. My...

doubt on scores when length_penalty = avg

This I know. BUT it means that when lp = None or lp = avg, then "scores" (from "decode_strategy.scores") are not the same: in the first case they are not...

doubt on scores when length_penalty = avg

In other words, here: https://github.com/OpenNMT/OpenNMT-py/blob/c8081afa3fab4e169c34a84f9304cf37184f97d7/onmt/translate/beam_search.py#L179 shouldn't we add log-probs instead?

doubt on scores when length_penalty = avg

I understand, but in a scenario where we want to dump all scores to do some reranking, it means that we cannot handle them the same way when the user set...

doubt on scores when length_penalty = avg

@guillaumekln something weird: if I dump scores along with predictions here: https://github.com/OpenNMT/OpenNMT-py/blob/master/onmt/translate/translator.py#L499

    n_best_scores = list(map(lambda x: x.item(), trans.pred_scores[: self.n_best]))
    if self.report_align:
        align_pharaohs = [
            build_align_pharaoh(align)
            for align in trans.word_aligns[: self.n_best]...

doubt on scores when length_penalty = avg

Yes, now I recall this long discussion. So to some extent, since the best pred is not the best score, we could de-normalize the lp=avg scores so that at reporting time we...
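The de-normalization idea can be sketched as follows. This is an assumption-laden illustration, not the code that ended up in the translator: it assumes that with lp = avg the reported score is the summed token log-probability divided by the hypothesis length, and the helper names `avg_score` and `denormalize_avg_score` are hypothetical.

```python
def avg_score(token_log_probs):
    """Length-averaged score (assumed behavior of lp = avg)."""
    return sum(token_log_probs) / len(token_log_probs)


def denormalize_avg_score(score, length):
    """Recover the summed log-probability from a length-averaged score."""
    return score * length


# Two hypothetical hypotheses with made-up per-token log-probabilities.
hypotheses = {
    "short": [-0.5, -0.5],
    "long": [-0.3, -0.3, -0.3, -0.3, -0.3],
}

# Ranking by the averaged score and by the raw sum can disagree.
best_by_avg = max(hypotheses, key=lambda k: avg_score(hypotheses[k]))
best_by_sum = max(hypotheses, key=lambda k: sum(hypotheses[k]))

# At reporting time, multiplying back by the length gives comparable sums.
reported = {
    name: denormalize_avg_score(avg_score(lp), len(lp))
    for name, lp in hypotheses.items()
}
```

In this toy case the longer hypothesis wins under averaging while the shorter one has the higher summed log-probability, which is the mismatch that makes dumped scores hard to reuse for reranking across length-penalty settings.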

doubt on scores when length_penalty = avg

Where would be the best place to do it?