
How to train on our own data with a small dataset of 10-50 KB

Open PoojaPatel05 opened this issue 7 years ago • 2 comments

PoojaPatel05 avatar Nov 15 '17 12:11 PoojaPatel05

You can create your own dataset using a simple custom format where one line corresponds to one line of dialogue. Use === on its own line to separate conversations between 2 people. Example of a conversation file:

https://github.com/Conchylicultor/DeepQA/tree/master/data/lightweight

To use your conversation file <name>.txt, copy it into this repository (next to the example file linked above) and launch the program with the options --corpus lightweight --datasetTag <name>.

With this option you can train DeepQA on a dataset of any size. I have done it with a ~1 KB file. Check the link for more information.
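If it helps, here is a minimal Python sketch for building and sanity-checking a file in that format before training. It is not DeepQA's own loader: the helper names (`write_conversations`, `read_conversations`, `to_pairs`) are made up for illustration, and pairing each line with the one after it is only my assumption about how question/answer samples would be formed.

```python
from pathlib import Path

SEPARATOR = "==="  # conversation delimiter described above


def write_conversations(path, conversations):
    """Write conversations to a text file: one utterance per line,
    with a === line between conversations."""
    blocks = ["\n".join(lines) for lines in conversations]
    text = ("\n" + SEPARATOR + "\n").join(blocks) + "\n"
    Path(path).write_text(text, encoding="utf-8")


def read_conversations(path):
    """Parse the file back into a list of conversations (lists of lines).
    Blank lines and empty conversations are skipped."""
    conversations, current = [], []
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if line == SEPARATOR:
            if current:
                conversations.append(current)
            current = []
        elif line:
            current.append(line)
    if current:
        conversations.append(current)
    return conversations


def to_pairs(conversations):
    """Pair each line with the following line of the same conversation.
    Assumption for illustration only, not taken from the DeepQA code."""
    return [
        (conv[i], conv[i + 1])
        for conv in conversations
        for i in range(len(conv) - 1)
    ]
```

For example, calling `write_conversations("mydata.txt", [["Hi", "Hello, how are you?"], ["What is your name?", "I am a bot."]])` gives a file with two conversations, and `len(to_pairs(read_conversations("mydata.txt")))` tells you how many line pairs a small file really contains. A file of 10-50 KB will only yield a few hundred to a few thousand such pairs.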

Amiro0o avatar Nov 15 '17 20:11 Amiro0o

There is no problem, it's the nature of Deep Learning. You have to give an RNN a big dataset to get better results. If you only want your bot to answer the questions in your dataset, using AIML is better, but if you want to make an A.I. that can answer almost any question, you have to use Deep Learning, and Deep Learning is all about BIG training data.

Amiro0o avatar Nov 16 '17 11:11 Amiro0o