pragmatic_segmenter
Language support
Can you list all the supported languages? It would be helpful to know if I were to use this in a project.
Would you be interested in contributions that add new languages? We are working on a 300K-sentence Portuguese corpus that has many segmentation errors, so we may find more hard cases. What would be the guidelines for adding a new language?