Maarten van Gompel

365 comments by Maarten van Gompel

That looks wrong indeed, thanks for reporting. I'll have to dive into what went wrong.

This would indeed also be useful for the python-frog binding: it would make parsing more robust.

This ties in somewhat with a question I have, so I'll just ask it here: what about the precedence/order of the gazetteers? I see that they...

We do have to consider the realistic case where somebody only runs certain modules of Frog (say PoS-tagging and lemmatisation) and someone else at a later stage wants to add...

With ``--nostdout``: ``frog-:Frogging in total took: 475 seconds, 806 milliseconds and 28 microseconds`` (this might not mean much, since load varies)
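To compare runs like these, the duration in a timing line can be converted to a single number of seconds. The following is only a minimal sketch: the pattern is based solely on the line format quoted above, and the names `DURATION_RE` and `duration_in_seconds` are made up for illustration, not part of Frog or python-frog.

```python
import re

# Matches the duration part of a timing line in the format quoted above,
# e.g. "475 seconds, 806 milliseconds and 28 microseconds".
# Assumption: all three units are always present in this order.
DURATION_RE = re.compile(
    r"(\d+) seconds, (\d+) milliseconds and (\d+) microseconds"
)


def duration_in_seconds(log_line: str) -> float:
    """Return the duration found in a timing line as fractional seconds."""
    match = DURATION_RE.search(log_line)
    if match is None:
        raise ValueError(f"no duration found in: {log_line!r}")
    seconds, millis, micros = (int(group) for group in match.groups())
    return seconds + millis / 1_000 + micros / 1_000_000


line = (
    "frog-:Frogging in total took: "
    "475 seconds, 806 milliseconds and 28 microseconds"
)
print(f"{duration_in_seconds(line):.6f} s")
```

Lines without a duration raise a `ValueError` here rather than being skipped silently, so a changed log format would be noticed.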

With tokenisation:
```
frog-:Frogging morr001cryp01_01.notok.folia.xml
frog-tok-:ucto: --filter=NO is automatically set. inputclass equals outputclass!
frog-:tokenisation took: 3 seconds, 359 milliseconds and 505 microseconds
frog-:CGN tagging took: 15 seconds, 951 milliseconds and...
```

Reducing threads to 4 instead of 40 (no tokenisation, no stdout):
```
frog-:Frogging morr001cryp01_01.tok.folia.xml
frog-tok-:ucto: --filter=NO is automatically set. inputclass equals outputclass!
frog-:tokenisation took: 0 seconds, 49 milliseconds and 709...
```

Single-threaded:
```
frog-:Frogging morr001cryp01_01.tok.folia.xml
frog-tok-:ucto: --filter=NO is automatically set. inputclass equals outputclass!
frog-:tokenisation took: 0 seconds, 17 milliseconds and 784 microseconds
frog-:CGN tagging took: 9 seconds, 568 milliseconds and...
```

Another minor feature request for debugging:

* ``frog-:Initialization done.`` (could it report the time for this as well?)

So I think the conclusion is to parallelise over documents when ``--testdir`` is used, rather than over modules, and to set a smaller number of threads?
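The idea can be illustrated with a small sketch: each document runs through the whole pipeline sequentially, and a small pool of workers handles several documents at once. This is not Frog's actual implementation; `process_documents` and the stand-in `fake_pipeline` are hypothetical names, and the default of 4 workers is only borrowed from the experiment above.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable


def process_documents(
    paths: Iterable[str],
    process_one: Callable[[str], object],
    max_workers: int = 4,
) -> Dict[str, object]:
    """Run process_one on every document, a few documents at a time.

    process_one is a hypothetical stand-in for running all modules
    sequentially on a single document.
    """
    path_list = list(paths)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as the input paths.
        results = list(pool.map(process_one, path_list))
    return dict(zip(path_list, results))


def fake_pipeline(path: str) -> int:
    """Placeholder for real processing: just returns the name length."""
    return len(path)


# Hypothetical file names, for illustration only.
docs = ["a.folia.xml", "bb.folia.xml", "ccc.folia.xml"]
print(process_documents(docs, fake_pipeline, max_workers=2))
```

In Python itself, threads only help when the work releases the global interpreter lock; for CPU-bound pure-Python work a process pool would be the closer equivalent.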