
report a big PPL score on yelp

Open hejunqing opened this issue 5 years ago • 1 comments

Hello, and thanks for sharing your work. I trained a model following the steps in the README and ran the evaluation using run_all_evaluator.sh. Most of the metrics are identical to the results reported in your paper, except PPL. The results for my trained model are: ll_scores: [(-9.701861720617387, 106.5074394250216), (-10.269295644873736, 120.9065905583248)]. The mean PPL is 113.7, whereas it should be around 32. I think the gap may be due to a different vocabulary or to training KenLM on a different corpus. I used yelp_corpus_adapter directly for data preparation and yelp/reviews-train.txt to train KenLM. Did I miss something?
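For reference, the 113.7 figure is just the average of the second element of each tuple. Here is a minimal sketch of that arithmetic, assuming each tuple is (log-likelihood, perplexity); the variable names are only illustrative and are not taken from the repository:

```python
# Values copied from the ll_scores output above.
# Assumption: each tuple is (log-likelihood, perplexity).
ll_scores = [
    (-9.701861720617387, 106.5074394250216),
    (-10.269295644873736, 120.9065905583248),
]

# Keep only the perplexity component of each tuple and average it.
ppl_values = [ppl for _, ppl in ll_scores]
mean_ppl = sum(ppl_values) / len(ppl_values)

print(round(mean_ppl, 1))
```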

hejunqing, Oct 14 '19 06:10

I have the same issue. I also tried training the language model on the dev and test splits, but got a similar PPL. Note that overall_evaluator.py needs two changes: line 62 should become `ll_score, ppl_score = language_fluency.score_generated_sentences(generated_text_file_path, options.language_model_path)` and line 68 should become `ll_scores.append(ppl_score)`, because otherwise the script outputs a tuple of negative log likelihood and perplexity (this might have something to do with KenLM versions).
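A rough sketch of what the two changed lines do, in isolation. The function below is a hypothetical stand-in for language_fluency.score_generated_sentences (it just returns the first tuple from the original post), and the file paths are placeholders, so this only illustrates the tuple unpacking, not the real scoring:

```python
from typing import List, Tuple


def score_generated_sentences(
    generated_text_file_path: str, language_model_path: str
) -> Tuple[float, float]:
    """Hypothetical stand-in: returns (log-likelihood, perplexity)."""
    return -9.701861720617387, 106.5074394250216


ll_scores: List[float] = []

# Changed line 62: unpack the returned tuple instead of storing it whole.
ll_score, ppl_score = score_generated_sentences(
    "generated.txt",  # placeholder path
    "lm.binary",      # placeholder path
)

# Changed line 68: append only the perplexity value.
ll_scores.append(ppl_score)

print(ll_scores)
```

With this change the list holds plain perplexity values rather than tuples, so averaging it gives the mean PPL directly.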

vrublack, Dec 03 '19 13:12