
How to rerank fine-tuned DialoGPT outputs with DialogRPT using HuggingFace Transformers?

Open tsutsen opened this issue 3 years ago • 1 comment

I am not satisfied with the responses that DialoGPT produces -- for the most part, they seem pretty random and AI-ish to me. I fine-tuned the model on my dataset using Transformers' Trainer, but that did not help much -- the responses are often just out-of-context quotes from the dataset. I want these quotes to be at least relevant, so I decided to try DialogRPT human-vs-rand and human-vs-machine.

The problem is that I do not understand how to rerank DialoGPT responses with DialogRPT using Transformers. Should I use DialogRPT during fine-tuning to compute the loss? Or is it possible to plug it in as a LogitsProcessor? If so, how? As I understand it, Transformers' generate() method outputs scores for every token, but DialogRPT outputs a single number for a whole response. How can I modify the scores of a response then?
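For reference while this question is open, here is one way the pieces are commonly wired together: instead of pushing DialogRPT into the token-level scores, generate several finished candidates with `generate(num_return_sequences=...)` and score each complete (context, response) pair with DialogRPT afterwards, keeping the highest-scoring one. The sketch below assumes the public `microsoft/DialoGPT-medium` and `microsoft/DialogRPT-human-vs-rand` checkpoints (substitute your fine-tuned model), the sampling settings are arbitrary, and this is not an official recipe from the DialogRPT authors:

```python
def rerank(candidates, score_fn):
    """Sort candidate responses best-first by score_fn (higher = better)."""
    return sorted(candidates, key=score_fn, reverse=True)


def main():
    # Heavy dependencies are imported here so rerank() stays usable without them.
    import torch
    from transformers import (
        AutoModelForCausalLM,
        AutoModelForSequenceClassification,
        AutoTokenizer,
    )

    # Substitute your fine-tuned checkpoint for the generator.
    gen_tok = AutoTokenizer.from_pretrained("microsoft/DialoGPT-medium")
    gen_model = AutoModelForCausalLM.from_pretrained("microsoft/DialoGPT-medium")
    rpt_tok = AutoTokenizer.from_pretrained("microsoft/DialogRPT-human-vs-rand")
    rpt_model = AutoModelForSequenceClassification.from_pretrained(
        "microsoft/DialogRPT-human-vs-rand"
    )

    context = "Does money buy happiness?"
    input_ids = gen_tok.encode(context + gen_tok.eos_token, return_tensors="pt")

    # Sample several candidates instead of taking one greedy response.
    outputs = gen_model.generate(
        input_ids,
        do_sample=True,
        top_p=0.9,
        max_new_tokens=40,
        num_return_sequences=8,
        pad_token_id=gen_tok.eos_token_id,
    )
    candidates = [
        gen_tok.decode(o[input_ids.shape[-1]:], skip_special_tokens=True)
        for o in outputs
    ]

    def dialogrpt_score(response):
        # DialogRPT scores a (context, response) pair joined by <|endoftext|>.
        ids = rpt_tok.encode(
            context + "<|endoftext|>" + response, return_tensors="pt"
        )
        with torch.no_grad():
            logit = rpt_model(ids).logits
        return torch.sigmoid(logit).item()

    best = rerank(candidates, dialogrpt_score)[0]
    print("best response:", best)


if __name__ == "__main__":
    main()
```

Note that this reranks whole responses after generation, which sidesteps the per-token vs. single-number mismatch described above; no LogitsProcessor or custom training loss is involved.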

I am new to machine learning and this stuff is quite overwhelming for me; any help is very appreciated!

tsutsen avatar May 04 '21 18:05 tsutsen

Hi @tsutsen, thanks for your interest in our work! I'm going to post replies in this issue.

golsun avatar May 04 '21 18:05 golsun