transfer-learning-conv-ai
Sampling a personality is too slow, and the model repeats questions
- When you call get_dataset_personalities(tokenizer, args.dataset_path, args.dataset_cache), it parses personachat_self_original.json, which contains the whole training set, and takes a long time. I think it would be better to sample from a smaller file (see the sketch after the chat example below).
- Although the interact input contains the chat history, the model didn't do well. Here is a chat example:
Selected personality: my mom is my best friend. i have four sisters. i believe that mermaids are real. i love iced tea.
me: it's so good to have sisters
bot: it is. my mom is my best friend.
me: how about your sisters
bot: they are my sisters.
me: how old are you
bot: i'm 34 and you?
me: i'm also 34 what a coincidence
bot: you should get married
me: yes i'm married, and have a daughter
bot: i believe mermaids are real
me: are you married?
bot: no, but i do believe mermaids are real
me: i don't believe that
bot: how old are you?
me: you already know that
It repeated the same question.
Is this because the model doesn't have a long-range attention mechanism?
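For the slow start-up, one possible workaround is sketched below: parse personachat_self_original.json once offline, dump a small random sample of personalities to a separate file, and let interact.py load only that small file. This assumes the dataset file has its usual layout (a "train" list whose entries each carry a "personality" list of strings); the output file name personalities_sample.json is made up for this example.

# Sketch: extract a small random sample of personalities once, so later
# interact runs can load the small file instead of the whole training set.
import json
import random

NUM_SAMPLES = 100  # keep only a handful of personalities

with open("personachat_self_original.json", "r", encoding="utf-8") as f:
    dataset = json.load(f)

# Assumed layout: every training entry has a "personality" list of strings.
all_personalities = [entry["personality"] for entry in dataset["train"]]
sample = random.sample(all_personalities, min(NUM_SAMPLES, len(all_personalities)))

with open("personalities_sample.json", "w", encoding="utf-8") as f:
    json.dump(sample, f)

# interact.py could then load and tokenize only this small file, e.g.:
# personalities = json.load(open("personalities_sample.json"))
# personality = random.choice(personalities)

Tokenizing only the sampled personalities instead of the whole training set should bring start-up time down to a few seconds.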
Hi @linkinbird,
Even though I made the code as simple as I could for people to use and test, this was the state of the art on the automatic metrics of the ConvAI2 dialog competition last December, which means that any developments of it are basically open research questions, and I don't have definite answers for you.
Of course, you can just filter the output to avoid repetition, but I don't think that's interesting in terms of science, so I didn't add any heuristics like that to the code.
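For reference, a filter of that kind could be as simple as the sketch below: reject a candidate reply that repeats a long n-gram from the bot's own recent replies and re-sample a few times. The generate_reply callable stands in for whatever sampling routine produces one candidate utterance (for example the sampling loop in interact.py); its name and signature are assumptions made for this sketch, not part of the repo.

# Heuristic sketch: re-sample a reply when it repeats the bot's recent output.
from typing import Callable, List

def ngrams(tokens: List[str], n: int):
    # All contiguous n-grams of a token list, as a set of tuples.
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def repeats_history(candidate: str, bot_history: List[str], n: int = 3) -> bool:
    # True if the candidate shares any n-gram with a previous bot reply.
    cand = ngrams(candidate.lower().split(), n)
    return any(cand & ngrams(past.lower().split(), n) for past in bot_history)

def filtered_reply(generate_reply: Callable[[], str],
                   bot_history: List[str],
                   max_tries: int = 5) -> str:
    # Keep re-sampling until the reply no longer repeats recent bot output,
    # giving up after max_tries attempts.
    reply = generate_reply()
    for _ in range(max_tries):
        if not repeats_history(reply, bot_history):
            break
        reply = generate_reply()
    return reply

Such a filter only hides the symptom: the model still assigns high probability to the repeated question, which is why heuristics like this are not very interesting scientifically.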
Repetition is an open problem in dialog and language generation, even for large models like GPT-2 medium. Why don't you play with the code and test some ideas? If you find a new solution to this problem, you should write a paper and submit it to a conference or a workshop (and open-source the code!) :-)
Thanks @thomwolf, very informative. I'll look into it more.