
Why is the default C = 300 used in MITIE for NER training?

Open munaAchyuta opened this issue 6 years ago • 0 comments

I need some help.

Sorry, this is not exactly an issue; it is a request for information.

Could someone explain why MITIE uses a default of C = 300? Training runs cross-validation with min_C = 0.0001, max_C = 5000 and epsilon = 1, and I believe the point of that cross-validation is to choose the best hyperparameters (C and epsilon). If so, what is the purpose of defining C = 300?

I know what C is for (it is the regularisation hyperparameter).

My problem: I tried different values of C while keeping everything else unchanged. Every time I got about the same accuracy and F1 score, but a different best C. Why is that?

Please see the log:

==============================================
C=300
num training samples: 1441
C: 200 f-score: 0.734335
C: 400 f-score: 0.735081
C: 300 f-score: 0.731994
C: 500 f-score: 0.735241
C: 700 f-score: 0.734709
C: 520 f-score: 0.733273
C: 450.957 f-score: 0.733804
C: 483.4 f-score: 0.736308
C: 480.156 f-score: 0.735241
C: 490.078 f-score: 0.734653
C: 484.607 f-score: 0.735241
C: 482.381 f-score: 0.732305
C: 483.799 f-score: 0.734653
C: 483.236 f-score: 0.732149
best C: 483.4
test on train:
286 2 0 3
0 759 0 3
0 0 43 0
4 6 0 335
overall accuracy: 0.987509
Part II: elapsed time: 19417 seconds.
==============================================
C=100
num training samples: 1420
C: 0.01 f-score: 0.673219
C: 200 f-score: 0.75807
C: 100 f-score: 0.758977
C: 148.954 f-score: 0.758783
C: 124.134 f-score: 0.759333
C: 121.721 f-score: 0.757521
C: 136.154 f-score: 0.760752
C: 134.952 f-score: 0.756639
C: 142.253 f-score: 0.757164
C: 138.668 f-score: 0.758945
C: 137.088 f-score: 0.756806
C: 136.031 f-score: 0.759333
C: 136.479 f-score: 0.759459
best C: 136.154
test on train:
286 2 0 3
0 761 0 1
0 0 43 0
4 9 0 311
overall accuracy: 0.98662
Part II: elapsed time: 6148 seconds.
==============================================
C=50
num training samples: 1432
C: 0.01 f-score: 0.670678
C: 200 f-score: 0.754349
C: 100 f-score: 0.755016
C: 149.215 f-score: 0.753461
C: 121.914 f-score: 0.755938
C: 118.753 f-score: 0.753097
C: 134.168 f-score: 0.75631
C: 129.929 f-score: 0.756474
C: 129.128 f-score: 0.755917
C: 131.916 f-score: 0.754349
C: 130.128 f-score: 0.755402
C: 129.586 f-score: 0.755938
best C: 129.929
test on train:
286 2 0 3
0 761 0 1
0 0 43 0
5 10 0 321
overall accuracy: 0.985335
Part II: elapsed time: 5562 seconds.
df.number_of_classes(): 4
==============================================
C=300
num training samples: 1455
C: 200 f-score: 0.73822
C: 400 f-score: 0.736475
C: 300 f-score: 0.738895
C: 271.805 f-score: 0.737705
C: 326.638 f-score: 0.735243
C: 292.355 f-score: 0.738378
C: 302.664 f-score: 0.733705
C: 296.35 f-score: 0.736475
C: 298.977 f-score: 0.737146
C: 300.35 f-score: 0.736944
C: 299.649 f-score: 0.738933
C: 299.804 f-score: 0.735961
best C: 299.649
test on train:
288 2 0 1
0 760 0 2
0 0 43 0
5 8 0 346
overall accuracy: 0.987629
Part II: elapsed time: 11576 seconds.
df.number_of_classes(): 4
==============================================
C=500
Part II: train segment classifier
now do training
num training samples: 1358
PART-II C: 500
PART-II epsilon: 0.0001
PART-II num threads: 4
PART-II max iterations: 2000
C: 400 f-score: 0.774171
C: 600 f-score: 0.778615
C: 500 f-score: 0.779291
C: 538.343 f-score: 0.774471
C: 470.021 f-score: 0.779522
C: 480.425 f-score: 0.776386
C: 443.145 f-score: 0.774217
C: 463.96 f-score: 0.775954
C: 472.435 f-score: 0.775831
C: 468.168 f-score: 0.770751
C: 470.707 f-score: 0.772416
C: 469.493 f-score: 0.770333
C: 470.138 f-score: 0.779291
best C: 470.021
test on train:
287 2 0 2
0 761 0 1
0 0 43 0
6 9 0 247
overall accuracy: 0.985272
Part II: elapsed time: 18762 seconds.
df.number_of_classes(): 4
==============================================

From the log above: why does the best C always come out close to the C value I supply, no matter which value I choose?
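To make the pattern easier to see, here is a minimal Python sketch that tabulates the five runs. It is not part of MITIE. The numbers are copied by hand from the log above, and the names `RUNS`, `MAX_C` and `summarize` are made up for this illustration. For each run it reports how far the reported best C ended up from the starting C, and how much the overall accuracy varies across runs.

```python
# Hand-copied from the log above, one tuple per run:
# (starting C, reported best C, f-score at best C, overall accuracy on train).
RUNS = [
    (300, 483.4, 0.736308, 0.987509),
    (100, 136.154, 0.760752, 0.98662),
    (50, 129.929, 0.756474, 0.985335),
    (300, 299.649, 0.738933, 0.987629),
    (500, 470.021, 0.779522, 0.985272),
]

MAX_C = 5000  # upper bound of the search range quoted in the question


def summarize(runs):
    """Return one dict per run with the distance between start C and best C."""
    rows = []
    for start_c, best_c, f_score, accuracy in runs:
        rows.append(
            {
                "start_c": start_c,
                "best_c": best_c,
                "offset": best_c - start_c,  # signed move away from the start
                "f_score": f_score,
                "accuracy": accuracy,
            }
        )
    return rows


rows = summarize(RUNS)

# Largest move away from the starting C, in absolute terms and
# as a fraction of the allowed range.
largest_offset = max(abs(row["offset"]) for row in rows)
largest_offset_fraction = largest_offset / MAX_C

# Difference between the best and worst overall accuracy across the runs.
accuracies = [row["accuracy"] for row in rows]
accuracy_spread = max(accuracies) - min(accuracies)

for row in rows:
    print(
        f"start C={row['start_c']:>4}  best C={row['best_c']:>8}  "
        f"offset={row['offset']:+9.3f}  f-score={row['f_score']:.6f}  "
        f"accuracy={row['accuracy']:.6f}"
    )
print(f"largest |best C - start C|: {largest_offset:.3f}")
print(f"as a fraction of max_C:     {largest_offset_fraction:.4f}")
print(f"accuracy spread over runs:  {accuracy_spread:.6f}")
```

In every run the best C stays within a few hundred of the starting value even though the allowed range goes up to 5000, and the overall accuracy barely changes between runs.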

Thanks in advance. @baali @scotthaleen @avitale @kecsap @lopuhin @davisking @jinyichao @avitale

munaAchyuta • Mar 15 '18 07:03