sacremoses
Is there a plan to have sent_tokenize in this library?
Thanks for the awesome work porting this into a separate library. It makes it a great choice for people looking for a light library for tokenization / detokenization.
Was wondering if there was a plan to port sent_tokenize? It's in the repo but looks deprecated?
Actually, that sent_tokenize is a can of worms, thus the reluctance to complete the code =)
I'm a little packed these couple of days, but let me see if I can sit down and hack up a new version of sent_tokenize and take into consideration the other sentence tokenizers that have become available in the meantime.
That'd be fantastic! :)