drupchen
Finding utterances is the same as tokenizing into sentences. This can't be done with punctuation alone (correct me if I'm wrong). That is what I tried to do with...
There is also another bug that @10zinten reported to me (still waiting for test data): the སྟེ་ ཏེ་ དེ་ particles often seem to be used in the middle of sentences....
I had the same problem with a file containing a single line and no trailing newline. I solved it simply by adding a newline at the end of the file. Note:...
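
In case it helps anyone else, here is a minimal sketch of that workaround in Python. The function names `ensure_trailing_newline` and `fix_file` are made up for illustration and are not part of any library. It only appends a newline when the last byte of the input is not already one.

```python
import os


def ensure_trailing_newline(text: str) -> str:
    """Return text with a final newline appended if it lacks one.

    Empty input is returned unchanged.
    """
    if text and not text.endswith("\n"):
        return text + "\n"
    return text


def fix_file(path: str) -> bool:
    """Append a newline to the file at `path` if its last byte is not one.

    Works on raw bytes, so the rest of the file and its encoding are left
    untouched. Returns True if the file was modified.
    """
    with open(path, "rb+") as f:
        f.seek(0, os.SEEK_END)
        if f.tell() == 0:
            return False  # empty file, nothing to fix
        f.seek(-1, os.SEEK_END)
        if f.read(1) == b"\n":
            return False  # already ends with a newline
        f.seek(0, os.SEEK_END)
        f.write(b"\n")
        return True
```

Calling `fix_file("input.txt")` (the file name is just an example) before running the tokenizer would apply the workaround in place. Running it a second time changes nothing.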