Maarten van Gompel
That does look wrong, thanks for reporting. I'll have to dive into what went wrong.
This would indeed also be useful for the python-frog binding... more robust parsing..
This ties in somewhat with a question I have, so I'll just ask it here: what about the precedence/order of the gazetteers? I see that they...
We do have to consider the realistic case where somebody only runs certain modules of Frog (say PoS-tagging and lemmatisation) and someone else at a later stage wants to add...
With ``--nostdout``: ``frog-:Frogging in total took: 475 seconds, 806 milliseconds and 28 microseconds`` (might not mean much, since load varies)
With tokenisation: ``` frog-:Frogging morr001cryp01_01.notok.folia.xml frog-tok-:ucto: --filter=NO is automatically set. inputclass equals outputclass! frog-:tokenisation took: 3 seconds, 359 milliseconds and 505 microseconds frog-:CGN tagging took: 15 seconds, 951 milliseconds and...
Reducing threads to 4 instead of 40 (no tokenisation, no stdout): ``` frog-:Frogging morr001cryp01_01.tok.folia.xml frog-tok-:ucto: --filter=NO is automatically set. inputclass equals outputclass! frog-:tokenisation took: 0 seconds, 49 milliseconds and 709...
Single-threaded: ``` frog-:Frogging morr001cryp01_01.tok.folia.xml frog-tok-:ucto: --filter=NO is automatically set. inputclass equals outputclass! frog-:tokenisation took: 0 seconds, 17 milliseconds and 784 microseconds frog-:CGN tagging took: 9 seconds, 568 milliseconds and...
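To compare runs like the ones above without reading the numbers by eye, the ``took:`` lines can be pulled out of a log and converted to durations. The following is a minimal sketch, not part of Frog: the function name ``parse_timings`` is hypothetical, and the regular expression assumes the ``frog-:<module> took: N seconds, N milliseconds and N microseconds`` shape seen in the logs quoted here.

```python
import re
from datetime import timedelta

# Assumed line shape, taken from the log excerpts in this thread:
#   frog-:<label> took: <s> seconds, <ms> milliseconds and <us> microseconds
TIMING = re.compile(
    r"frog-:(?P<label>[^:]+?) took: "
    r"(?P<s>\d+) seconds, (?P<ms>\d+) milliseconds and (?P<us>\d+) microseconds"
)


def parse_timings(log_text):
    """Return a dict mapping each reported label to a timedelta.

    Hypothetical helper: works on multi-line logs as well as on logs
    that were flattened onto a single line.
    """
    timings = {}
    for match in TIMING.finditer(log_text):
        timings[match.group("label").strip()] = timedelta(
            seconds=int(match.group("s")),
            milliseconds=int(match.group("ms")),
            microseconds=int(match.group("us")),
        )
    return timings


if __name__ == "__main__":
    sample = (
        "frog-:tokenisation took: 3 seconds, 359 milliseconds and 505 microseconds "
        "frog-:Frogging in total took: 475 seconds, 806 milliseconds and 28 microseconds"
    )
    for label, duration in parse_timings(sample).items():
        print(f"{label}: {duration.total_seconds():.6f} s")
```

Using ``timedelta`` keeps the values exact, so per-module timings from different thread settings can be subtracted or sorted directly.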
Another minor feature request for debugging:
* ``frog-:Initialization done.`` (could it report time for this as well?)
So I think the conclusion is to parallelise at the document level when ``--testdir`` is used, rather than at the module level, and to default to a smaller number of threads?
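As an illustration of the document-level idea, here is a minimal Python sketch that hands whole files from a directory to a small worker pool. It is not how Frog itself is implemented: ``process_document`` is a hypothetical stand-in for running the full pipeline on one file (here it only reports the file size), and the default of 4 workers simply mirrors the reduced thread count tried above.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def process_document(path):
    """Hypothetical stand-in for processing one document end to end.

    A real worker would run all modules on the file; this one just
    returns the file name and its size in bytes.
    """
    return path.name, len(path.read_bytes())


def process_directory(testdir, max_workers=4):
    """Process every file in testdir, one document per worker at a time."""
    paths = sorted(p for p in Path(testdir).iterdir() if p.is_file())
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() preserves input order, so results line up with paths.
        return dict(pool.map(process_document, paths))
```

Each document is handled by exactly one worker, so there is no per-module synchronisation inside a document; the pool size is the only knob.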