
Lot of overfitting

Open mrinal18 opened this issue 7 years ago • 5 comments

The pretrained model is giving a lot of overfitted outputs; they don't make sense while chatting. Can you suggest anything to reduce this?

mrinal18 avatar Jun 30 '17 12:06 mrinal18

Did you just run interactive mode? I've noticed it doesn't train properly if you do that.

Similar question from me at #139. I think you have to run the non-interactive mode first, and THEN re-run with interactive.

Notice that when I omit --test interactive, it goes into a different mode:

~/Dev/git/Conchylicultor/DeepQA (master)$ ./main.py --modelTag pretrainedv2 
Welcome to DeepQA v0.1 !

TensorFlow detected: v1.2.1
Loading dataset from ~/git/Conchylicultor/DeepQA/data/samples/dataset-cornell-length10-filter1-vocabSize40000.pkl
Loaded cornell: 24643 words, 159657 QA
Model creation...
2017-08-10 20:39:55.432684: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2017-08-10 20:39:55.432710: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2017-08-10 20:39:55.432715: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2017-08-10 20:39:55.432719: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
Initialize variables...
WARNING: No previous model found, starting from clean directory: ~/git/Conchylicultor/DeepQA/save/model-pretrainedv2
Start training (press Ctrl+C to save and exit)...

----- Epoch 1/30 ; (lr=0.002) -----
Shuffling the dataset...
Training:   1%|▌                                                                                     | 4/624 [00:25<1:05:31,  6.34s/it]

EMCP avatar Aug 10 '17 18:08 EMCP
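
Piecing the comment and log above together, the intended workflow is presumably two steps (an inference from this thread, not documented behavior): first train, then chat:

./main.py --modelTag pretrainedv2
./main.py --test interactive --modelTag pretrainedv2

The first invocation trains and saves checkpoints under save/model-pretrainedv2, as the log shows; the second should then load those checkpoints and start the interactive session.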

Hey Eric, I trained it again, but on a different system, and it gave me correct responses. I did tweak it a bit, though, since it was giving vague answers most of the time. Its LSTM model is not properly built, so it doesn't give accurate responses.

mrinal18 avatar Aug 10 '17 19:08 mrinal18
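
For context on the model being discussed: DeepQA is an encoder-decoder ("seq2seq") chatbot, i.e. one LSTM reads the question and a second LSTM generates the answer. Below is a minimal sketch of that shape, written in modern Keras rather than DeepQA's actual TF 1.x graph code; every dimension is a hypothetical illustration (only vocab_size echoes the vocabSize40000 in the dataset filename above):

```python
from tensorflow import keras
from tensorflow.keras import layers

vocab_size = 40000  # echoes vocabSize40000 in the dataset filename
embed_dim = 64      # hypothetical
latent_dim = 256    # hypothetical

# Encoder: embed the question tokens, keep only the final LSTM state.
encoder_inputs = keras.Input(shape=(None,), dtype="int32")
enc = layers.Embedding(vocab_size, embed_dim)(encoder_inputs)
_, state_h, state_c = layers.LSTM(latent_dim, return_state=True)(enc)

# Decoder: generate the answer conditioned on the encoder's final state.
decoder_inputs = keras.Input(shape=(None,), dtype="int32")
dec = layers.Embedding(vocab_size, embed_dim)(decoder_inputs)
dec = layers.LSTM(latent_dim, return_sequences=True)(
    dec, initial_state=[state_h, state_c])
decoder_outputs = layers.Dense(vocab_size, activation="softmax")(dec)

model = keras.Model([encoder_inputs, decoder_inputs], decoder_outputs)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```

Capacity matters here: too small a latent_dim underfits and gives vague answers, while a large one trained on only ~160k QA pairs memorizes them, which is consistent with the overfitting reported in this thread.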

@Mrinal18, so I'm assuming that if I run the command without --test interactive, I'm essentially retraining the model? Which is why the subsequent chats make more sense?

Also, could you elaborate on your changes? Did you just re-read the white paper and check the code itself?

EMCP avatar Aug 10 '17 20:08 EMCP

@Mrinal18 Do you mind sharing the changes you've made to the model?

LearnedVector avatar Aug 25 '17 20:08 LearnedVector

There seems to be no validation set in this code. That's why it's always overfitting.

PureVoyage avatar Nov 14 '17 02:11 PureVoyage
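
PureVoyage's diagnosis is the standard one: without a held-out validation set there is no signal for when the model starts memorizing the training pairs. A minimal, self-contained sketch of such a split (the helper and the toy qa_pairs list are hypothetical, not from the repo; 159657 just mirrors the QA count in the log above):

```python
import random

def split_dataset(samples, valid_fraction=0.1, seed=0):
    """Shuffle, then hold out a fraction of the samples for validation."""
    samples = list(samples)
    random.Random(seed).shuffle(samples)
    n_valid = max(1, int(len(samples) * valid_fraction))
    return samples[n_valid:], samples[:n_valid]

# Toy stand-in for the (question, answer) pairs loaded from the .pkl dataset.
qa_pairs = [("question %d" % i, "answer %d" % i) for i in range(159657)]
train_set, valid_set = split_dataset(qa_pairs)
print(len(train_set), len(valid_set))
```

Training would then update weights on train_set only and evaluate the loss on valid_set once per epoch; validation loss rising while training loss keeps falling is the classic overfitting signal, and the point to stop training or regularize harder.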