stanford-tensorflow-tutorials
How to use the Ubuntu Dialog Corpus
Hi, we have successfully trained the Stanford chatbot on the Cornell Movie-Dialogs Corpus, but it is giving random answers. We are now trying to use the Ubuntu Dialog Corpus, but we are unable to preprocess it. How can we convert it to a format similar to the Cornell Movie-Dialogs Corpus? Otherwise, can you please suggest another dataset that is similar to the Cornell Movie-Dialogs Corpus?
Thanks in advance.
You can find a very well-organized and cleaned conversational dataset (about 160K pairs) for training a chatbot here: https://github.com/bshao001/ChatLearner. That repository also contains scripts and instructions for preprocessing Reddit data in case you need more (for example, a million pairs).
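If you still want to try the Ubuntu Dialog Corpus, the conversion itself could look roughly like the sketch below. It is only an illustration under some assumptions: each raw Ubuntu dialog is assumed to be a tab-separated file with four columns (timestamp, sender, recipient, text), and the target is the Cornell layout, where movie_lines.txt and movie_conversations.txt separate fields with " +++$+++ ". The function name `ubuntu_to_cornell` and its parameters are made up for this example and are not part of the tutorial code; check them against the files you actually have.

```python
import csv
import io

# Field separator used by Cornell's movie_lines.txt and movie_conversations.txt.
SEP = " +++$+++ "


def ubuntu_to_cornell(tsv_text, dialog_id=0, first_line_no=0):
    """Convert one Ubuntu dialog into Cornell-style records.

    Assumption: tsv_text holds one dialog as tab-separated rows of
    (timestamp, sender, recipient, text). Returns a pair:
    (list of movie_lines.txt-style strings, one movie_conversations.txt-style
    string, or None if the dialog has no usable turns).
    """
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t",
                        quoting=csv.QUOTE_NONE)
    speakers = {}   # sender name -> user id (u0, u1, ...), local to this dialog
    lines = []      # Cornell-style line records
    line_ids = []   # line ids in order, for the conversation record

    for row in reader:
        if len(row) < 4 or not row[3].strip():
            continue  # skip malformed rows and empty utterances
        sender, text = row[1], row[3].strip()
        user_id = speakers.setdefault(sender, "u%d" % len(speakers))
        line_id = "L%d" % (first_line_no + len(lines))
        # The dialog id stands in for Cornell's movie id (m0, m1, ...).
        lines.append(SEP.join(
            [line_id, user_id, "m%d" % dialog_id, sender.upper(), text]))
        line_ids.append(line_id)

    if not lines:
        return [], None

    users = list(speakers.values())
    second_user = users[1] if len(users) > 1 else users[0]
    # str(line_ids) yields "['L0', 'L1']", the list syntax Cornell uses.
    conversation = SEP.join(
        [users[0], second_user, "m%d" % dialog_id, str(line_ids)])
    return lines, conversation
```

You would call this once per dialog file, pass a running `first_line_no` so line ids stay unique across dialogs, and append the results to two output files that take the place of movie_lines.txt and movie_conversations.txt. Note that the user ids here restart at u0 for every dialog, which is a simplification compared to the original corpus.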