problems with training

Open tilneyyang opened this issue 7 years ago • 3 comments

I made some minor changes to get this code to compile under Scala 2.12 and Java 8. When trying to retrain the English Interpretation model, I ran into some problems:

  1. In InterpretationTask.filterCorrect:
		//--Filter
		val good:Iterable[GoodOutput] = O.ckyCountType match {
			case O.CkyCountType.all => 
				//(all parses valid)
				val total:Double = correct.map{_.prob}.sum
				correct.map{ _.normalize(total,correct.length) }
			case O.CkyCountType.bestAll => 
				//(all best-scoring parses valid)
				System.out.println("correct size: " + correct.length + ".." + correct.getClass)
				val maxProb = correct.maxBy( _.prob ).prob
				val ok = correct.filter( _.prob == maxProb )
				val total:Double = ok.map{_.prob}.sum
				ok.map{ _.normalize(total,ok.length) }

If correct is empty, calling maxBy on it throws an exception. Can correct be empty here? For the moment, I have added a check so that an empty iterator is returned directly when correct is empty.

  2. The training process is extremely slow: after four hours, the program is still at iteration 0 according to the console log. Is this normal?

Any kind of help will be appreciated. Thanks!

tilneyyang, Jul 18 '17 06:07

Oh dear, this code was written with something closer to Scala 7 and Java 6, some odd 6 years ago. I'm not sure I'm in a position to support it anymore. Scala especially is notorious for making backwards-incompatible changes, meaning the code may well behave completely differently now than it did at writing.

But, to the best of my recollection:

  1. If there is no correct parse of a sentence, the example should be ignored altogether. The maxBy bug does appear to be a bug though. Perhaps I only ever used the all condition?

  2. Training should not take so long. The whole thing trained overnight 6 years ago, which means it should finish in a matter of hours now.
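
The empty-list guard discussed in point 1 could look roughly like the sketch below. It is not the project's code: `Parse` is a hypothetical stand-in for `GoodOutput`, and the body of `normalize` is an assumption made only so the example runs. Only the `bestAll` filtering logic mirrors the snippet quoted above.

```scala
// Hypothetical stand-in for the project's GoodOutput; only the names `prob`
// and `normalize` are taken from the snippet quoted in this thread.
case class Parse(prob: Double, weight: Double = 0.0) {
  // Assumed semantics: rescale this parse's probability by the total mass,
  // falling back to a uniform weight when the total is not positive.
  def normalize(total: Double, count: Int): Parse =
    copy(weight = if (total > 0.0) prob / total else 1.0 / count)
}

object BestAllFilter {
  // Keep only the best-scoring parses. When there is no correct parse,
  // return an empty result so the caller can skip the example instead of
  // calling maxBy on an empty collection, which throws.
  def bestAll(correct: Seq[Parse]): Seq[Parse] =
    if (correct.isEmpty) Seq.empty
    else {
      val maxProb = correct.maxBy(_.prob).prob
      val ok = correct.filter(_.prob == maxProb)
      val total = ok.map(_.prob).sum
      ok.map(_.normalize(total, ok.length))
    }
}
```

With this guard, a sentence that has no correct parse contributes nothing to the counts, which matches the intent that such an example be ignored altogether.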

gangeli, Jul 18 '17 07:07

[timex 103] a year ago {
          34701 candidates
          best guess: ScoreElem(0,0,(P2147483647Y,P2147483647Y),-24.55872625704437,[x, x+P2147483647Y))
          second best guess: ScoreElem(1,0,(P2147483647Y,P2147483647Y),-24.55872625704437,[x, x+P2147483647Y))
          WRONG
          0 in beam
          Guess:  [1989-10-25T16:00:00.000Z, 292278994-08-17T07:12:55.807Z)
          Gold:   [1988-06-30T15:00:00.000Z, 1988-09-30T16:00:00.000Z)
          Tree:   ("__ROOT__" ("Range" ("Range" ("Range" ( "future" "a" ) ) ( "nil" "year" ) ) ( "nil" "ago" ) ) ) 
          Ground: 1989-10-25T16:00:00.000Z
zero output
        } [30:55.0485 minutes]

I think there must be something wrong. I will check the code again to see what I can do. Thanks for the reply!

tilneyyang, Jul 18 '17 11:07

My name is shuwenjian. I like yellow T-shirts and blue trousers. My favourite is tennis. I lost a ball. I hope you can help me find it. Thanks! I want to is wangjia make friend

Hizees, Dec 14 '19 16:12