Ottokar Siirak

41 comments by Ottokar Siirak

Hi! punctuator currently does not support multiple GPUs. I've managed to train all my models in a few days, which has been acceptable for me. These functions in models.py have...

Hi! I did start with an encoder-decoder model, but realized that it's slightly overkill for punctuation restoration/sequence labelling. The encoder-decoder architecture is designed to solve problems where the length of...

Hi, thanks for reporting! It should actually work if you don't make any changes to the original main2.py script (I see you have removed the "p=p" line). The current implementation is...

@aliabbasjp yes, I understand, but try with unmodified code and follow the instructions in the readme (slightly modified below): Data preparation. In there should be *.test.txt, *.dev.txt and other *.txt...
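A quick way to sanity-check the data directory before training is to list what is actually in it. The snippet below is only a sketch: the helper name `split_data_files` is hypothetical and not part of punctuator, and it assumes nothing beyond the file suffixes mentioned above. The comment is cut off before it says what the remaining *.txt files are for, so they are simply grouped as "other".

```python
from pathlib import Path


def split_data_files(data_dir):
    """Group the *.txt files in data_dir by the suffixes named in the comment.

    Hypothetical helper for illustration only; not part of punctuator.
    """
    groups = {"test": [], "dev": [], "other": []}
    for path in sorted(Path(data_dir).glob("*.txt")):
        if path.name.endswith(".test.txt"):
            groups["test"].append(path.name)
        elif path.name.endswith(".dev.txt"):
            groups["dev"].append(path.name)
        else:
            # Any remaining *.txt file; its role is not stated in the snippet.
            groups["other"].append(path.name)
    return groups
```

If the "test" or "dev" list comes back empty, the directory does not match the layout described in the readme.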

Added dummy pauses to the punctuator.py script. Can you pull the new version and try something like: `cat data.dev.txt | python punctuator.py 1` The 1 at the end is important.

Hi and thank you! The pause annotation format looks correct, but I currently don't have any English models trained on pause-annotated data, because I just don't have the right...

The TED dataset was preprocessed by the authors of http://www.lrec-conf.org/proceedings/lrec2016/pdf/103_Paper.pdf and the resulting dataset is shared at: https://drive.google.com/file/d/0B13Cc1a7ebTuMElFWGlYcUlVZ0k/view I used this simple script to convert the format of the files:...

Hi! If you have a sufficient amount of decent-quality training data (10M - 40M words), then the model should be able to learn most of the rules on its own...

Hi! I think the author(s) of https://github.com/alpoktem/punkProse can better answer this question. Best, Ottokar

Hi! I currently don't have any plans to publish the API at the API store. The http://bark.phon.ioc.ee/punctuator API is for demo purposes only, but you can quite easily build your...