Kurt Shuster

Results 198 comments of Kurt Shuster

We typically pass the topic in as the very first message in the dialogue history, so your first turn would instead be: ```python first_turn = "\n".join( [ "Movies", "your persona:...

Each "your persona: " line is a persona statement you want the bot to emulate. If you wanted to specify the partner's persona, you would use `"partner's persona: "` instead.
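To make the format concrete, here is a minimal sketch of building that first turn; the topic comes from the snippet above, but the persona strings themselves are made up purely for illustration:

```python
# Build the first "turn" of dialogue history: the topic on the first
# line, then "your persona: " lines for the bot and (optionally)
# "partner's persona: " lines for the other speaker.
# The persona text below is illustrative, not from any real dataset.
first_turn = "\n".join(
    [
        "Movies",
        "your persona: i love science fiction.",
        "your persona: i watch a film every night.",
        "partner's persona: i prefer documentaries.",
    ]
)
print(first_turn)
```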

Yes, you can do exactly as you've described. It remains to be seen how effective this would be, depending on the model used.

Any conversational model would be a reasonable choice; bb2_400m is a good option.

Fine-tuning on new data while preserving knowledge of older capabilities is an open problem in language modeling. One approach would be to incorporate some original training data within the fine-tuned...
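As a rough illustration of that mixing idea, here is a simple "rehearsal"-style sketch; the function name and the 20% ratio are arbitrary choices of mine, not anything ParlAI prescribes:

```python
import random

def mixed_examples(finetune_data, original_data, orig_frac=0.2, seed=0):
    """Yield mostly new fine-tuning examples, with a fraction of
    original training examples interleaved so the model keeps seeing
    its old distribution. orig_frac=0.2 is an arbitrary illustration."""
    rng = random.Random(seed)
    for example in finetune_data:
        yield example
        if rng.random() < orig_frac:
            yield rng.choice(original_data)

new = [("new", i) for i in range(100)]
old = [("old", i) for i in range(10)]
stream = list(mixed_examples(new, old, orig_frac=0.5))
```

In practice the right ratio depends on how far the new data drifts from the original training distribution.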

Yes, we offer several ways of doing this via the `--rag-retriever-type` flag, such as using neural retrieval over document embeddings, or TFIDF over document sets. [Please see the relevant README for...
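The flag name comes from the comment above; everything else in this invocation (the command, model file, task, and the `dpr` value) is a placeholder sketch, so check the RAG README for the actual set of accepted retriever types:

```shell
# Hypothetical invocation -- model file and task are placeholders.
# --rag-retriever-type selects the retrieval backend (e.g. neural
# retrieval over embeddings vs. TFIDF over document sets).
parlai eval_model \
  --model-file <your_model_file> \
  --task <your_task> \
  --rag-retriever-type dpr
```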

Only to the extent the bot can fit the tokenized input into its context; that is determined by the `--truncate` / `--text-truncate` flags. If you're curious about bb2, that is...

The real factor is the number of position embeddings the pre-trained models used; the 3B model just so happened to use 128 positions while the 400M used 1024 (3B is...

Ahh, when I said there were no length restrictions, I meant that ParlAI itself can handle any length; the models are bound by their truncation length.
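To summarize the last few replies, a minimal sketch of what truncation to a fixed position-embedding budget looks like; the function and its left-truncation default are my own illustration, not ParlAI's implementation:

```python
def truncate(token_ids, max_len, truncate_left=True):
    """Keep at most max_len tokens. Dialogue models typically drop the
    oldest (left-most) tokens so the most recent turns survive."""
    if max_len is None or len(token_ids) <= max_len:
        return token_ids
    return token_ids[-max_len:] if truncate_left else token_ids[:max_len]

# e.g. a 1024-position budget (like the 400M model) vs. a
# 128-position budget (like the 3B model)
history = list(range(2000))
assert len(truncate(history, 1024)) == 1024
assert truncate(history, 128)[0] == 2000 - 128  # oldest tokens dropped
```

So "no length restriction" holds upstream of the model: any history fits in ParlAI, but only the last `max_len` tokens reach the network.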