Elie Roux
@drupchen can you give an [MCVE](https://stackoverflow.com/help/mcve)?
thanks! Indeed, if we take ``` [('TEXT', 'གཅིག'), ('PUNCT', '།། །། ༆ ། །'), ('TEXT', 'གཉིས')] ``` for instance, ideally it would do something like ``` [('TEXT', 'གཅིག'), ('PUNCT', '།།...
in short yes, I also consulted NT for the code and nagged him with many boring edge cases... I think what's implemented in Java is pretty good, unfortunately it's not...
thanks a lot! Finding the utterances is the first step before segmentation in the workflow to create the ACTIB corpus (and probably others) so I think having a way to...
yes, properly finding utterances involves more than punctuation and this script is a very good start, it's just that ideally it would be able to give the index of the...
also just a small detail, for the sake of completeness in the sentencify script, some titles in the Tengyur end with `བཞུགས` (without the `སོ`), perhaps other sentences too. maybe...
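To illustrate the `བཞུགས` case above, here is a minimal sketch of an end-of-title check that accepts both endings. It is not the sentencify script's actual code: the names `TITLE_END` and `ends_like_title` are hypothetical, and it assumes that only tsheg, shad, or whitespace may follow the ending.

```python
import re

# Assumption: a title ends with བཞུགས, optionally followed by སོ (with or
# without a tsheg in between), then only trailing tsheg, shad, or whitespace.
TITLE_END = re.compile(r"བཞུགས(?:་?སོ)?[་།\s]*$")


def ends_like_title(sentence: str) -> bool:
    """Return True if the sentence ends with བཞུགས or བཞུགས་སོ."""
    return TITLE_END.search(sentence) is not None
```

A form such as `བཞུགས་པ` is not matched, since something other than `སོ` follows the verb.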
Hi Robin, thanks a lot for your email! I think we exchanged a few emails a few years ago, Edward introduced us IIRC... this repo is a manual cleanup of...
Thanks! Ideally it should save without error (it does with the older setting)
yes, Pillow 9.5.0 / libtiff 4.5.0
Wonderful, thanks a lot! My test with the older version was on Debian 10 but it probably doesn't matter anymore