Maarten van Gompel comments

Results: 365 comments of Maarten van Gompel

Tree Viewer not adjusted to the text

That looks wrong indeed, thanks for reporting, I'll have to dive into what went wrong.

Add JSON output as an alternative to 'tabbed' format

This would indeed also be useful for the python-frog binding... more robust parsing..

Add NER post-processing step

This ties in somewhat with a question I have, so I'll just ask it here: what about the precedence/order of the gazetteers? I see that they...

Rerunning frog on already frogged FoLiA

We do have to consider the realistic case where somebody only runs certain modules of Frog (say PoS-tagging and lemmatisation) and someone else at a later stage wants to add...

Performance issues on processing huge collections -> revise multithreading implementation

with ``--nostdout``: ``frog-:Frogging in total took: 475 seconds, 806 milliseconds and 28 microseconds`` (might not mean much, load varies)
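For comparing runs like the ones in this thread, the timing lines can be turned into plain seconds. The following is a minimal sketch that assumes the log format is exactly as quoted above (``<label> took: N seconds, N milliseconds and N microseconds``); the names `TIMING` and `parse_timing` are hypothetical helpers, not part of Frog.

```python
import re

# Assumed format, taken from the log lines quoted in this thread:
#   "<label> took: <N> seconds, <N> milliseconds and <N> microseconds"
TIMING = re.compile(
    r"(?P<label>.+?) took: (?P<s>\d+) seconds, "
    r"(?P<ms>\d+) milliseconds and (?P<us>\d+) microseconds"
)


def parse_timing(line):
    """Return (label, seconds) for a timing line, or None if it does not match."""
    match = TIMING.search(line)
    if match is None:
        return None
    seconds = (
        int(match["s"])
        + int(match["ms"]) / 1_000
        + int(match["us"]) / 1_000_000
    )
    return match["label"].strip(), seconds


if __name__ == "__main__":
    line = (
        "frog-:Frogging in total took: 475 seconds, "
        "806 milliseconds and 28 microseconds"
    )
    print(parse_timing(line))
```

Since load varies between runs, the resulting numbers are best averaged over several runs before drawing conclusions.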

Performance issues on processing huge collections -> revise multithreading implementation

With tokenisation:
```
frog-:Frogging morr001cryp01_01.notok.folia.xml
frog-tok-:ucto: --filter=NO is automatically set. inputclass equals outputclass!
frog-:tokenisation took: 3 seconds, 359 milliseconds and 505 microseconds
frog-:CGN tagging took: 15 seconds, 951 milliseconds and...
```

Performance issues on processing huge collections -> revise multithreading implementation

Reducing threads to 4 instead of 40 (no tokenisation, no stdout):
```
frog-:Frogging morr001cryp01_01.tok.folia.xml
frog-tok-:ucto: --filter=NO is automatically set. inputclass equals outputclass!
frog-:tokenisation took: 0 seconds, 49 milliseconds and 709...
```

Performance issues on processing huge collections -> revise multithreading implementation

Single-threaded:
```
frog-:Frogging morr001cryp01_01.tok.folia.xml
frog-tok-:ucto: --filter=NO is automatically set. inputclass equals outputclass!
frog-:tokenisation took: 0 seconds, 17 milliseconds and 784 microseconds
frog-:CGN tagging took: 9 seconds, 568 milliseconds and...
```

Performance issues on processing huge collections -> revise multithreading implementation

Another minor feature request for debugging:
* ``frog-:Initialization done.`` (could it report time for this as well?)

Performance issues on processing huge collections -> revise multithreading implementation

So I think the conclusion is to parallelise across documents when ``--testdir`` is used, rather than across modules, and to set a smaller number of threads?
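To illustrate what document-level parallelism with a small worker count could look like from the outside, here is a rough sketch in Python. It is not how Frog implements anything: `process_directory` and `make_subprocess_worker` are hypothetical names, and the actual command line for processing one document is deliberately left to the caller.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def make_subprocess_worker(command_prefix):
    """Build a worker that runs `command_prefix + [path]` for one document.

    `command_prefix` is supplied by the caller (an assumption of this
    sketch); no specific Frog flags are implied here.
    """
    def worker(path):
        result = subprocess.run(
            [*command_prefix, str(path)], capture_output=True, text=True
        )
        return path, result.returncode
    return worker


def process_directory(directory, worker, max_workers=4, pattern="*.xml"):
    """Apply `worker` to every matching document, a few documents at a time.

    Threads are sufficient when each worker only waits on an external
    process; results come back in sorted path order.
    """
    paths = sorted(Path(directory).glob(pattern))
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(worker, paths))
```

Each document is handled by one worker from start to finish, so the degree of parallelism is bounded by `max_workers` rather than by the number of modules.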