DialoGPT
How to rerank fine-tuned DialoGPT outputs with DialogRPT using HuggingFace Transformers?
I am not satisfied with the responses that DialoGPT produces: for the most part, they seem pretty random and AI-ish to me. I fine-tuned the model on my dataset using Transformers' Trainer,
but that did not help much; the responses are often just out-of-context quotes from the dataset. I want these quotes to at least be relevant, so I decided to try the DialogRPT human-vs-rand and human-vs-machine models.
The problem is that I do not understand how to rerank DialoGPT responses with DialogRPT using Transformers. Should I use DialogRPT during fine-tuning to compute the loss? Or is it possible to connect it as a LogitsProcessor? If so, how? As I understand it, Transformers' generate()
method outputs scores for every token, while DialogRPT outputs a single number for a whole response. How can I modify the scores of a response then?
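For reference, here is the post-hoc reranking approach I had in mind: generate several candidates with DialoGPT, score each finished response with DialogRPT, and keep the highest-scoring one (so no LogitsProcessor is involved). This is just a sketch of my guess; the model names and the "<|endoftext|>" context/response separator are what I saw on the DialogRPT model cards, and I have not verified the whole pipeline end to end.

```python
def rerank(candidates, score_fn):
    """Return the candidate that score_fn rates highest."""
    scored = [(score_fn(c), c) for c in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[0][1]

def dialogrpt_rerank(context, num_candidates=5):
    # Hypothetical wiring; downloads the models on first run.
    # Imported lazily so the rerank() helper above works without these packages.
    import torch
    from transformers import (
        AutoModelForCausalLM,
        AutoModelForSequenceClassification,
        AutoTokenizer,
    )

    gen_tok = AutoTokenizer.from_pretrained("microsoft/DialoGPT-medium")
    gen = AutoModelForCausalLM.from_pretrained("microsoft/DialoGPT-medium")
    rank_tok = AutoTokenizer.from_pretrained("microsoft/DialogRPT-human-vs-rand")
    ranker = AutoModelForSequenceClassification.from_pretrained(
        "microsoft/DialogRPT-human-vs-rand"
    )

    # Sample several candidate replies from DialoGPT.
    input_ids = gen_tok.encode(context + gen_tok.eos_token, return_tensors="pt")
    outputs = gen.generate(
        input_ids,
        do_sample=True,
        top_p=0.9,
        max_new_tokens=40,
        num_return_sequences=num_candidates,
        pad_token_id=gen_tok.eos_token_id,
    )
    candidates = [
        gen_tok.decode(out[input_ids.shape[-1]:], skip_special_tokens=True)
        for out in outputs
    ]

    def score(response):
        # DialogRPT scores "context <|endoftext|> response" as one sequence;
        # sigmoid turns the single logit into a 0..1 score.
        ids = rank_tok.encode(
            context + "<|endoftext|>" + response, return_tensors="pt"
        )
        with torch.no_grad():
            logits = ranker(ids).logits
        return torch.sigmoid(logits).item()

    return rerank(candidates, score)
```

Is something like this the intended way to combine the two models, or should the scoring happen during generation?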
I am new to machine learning and this stuff is quite overwhelming for me; any help is much appreciated!
Hi @tsutsen, thanks for your interest in our work! I'm going to post replies in this issue.