ipatok
additional test data for tokenization tests
I didn't check your data carefully, but this is what we use to test lingpy's "ipa2tokens" function:
- https://github.com/lingpy/lingpy/blob/master/lingpy/tests/test_data/test_tokenization.tsv
You might want to check against those, as lingpy tokenizes 100% of them correctly.
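
In case it helps, here is a rough sketch of how such a check could look. I am assuming each row of the TSV holds the input string in the first column and the expected tokens, separated by spaces, in the second; please verify that against the actual file. The names `load_cases` and `find_failures` are made up for illustration, and `list` (one token per character) only stands in for the real tokenizer.

```python
import csv
import io


def load_cases(tsv_text):
    """Parse rows of (input string, expected tokens).

    Assumed layout: column 1 is the raw string, column 2 the expected
    tokens separated by spaces. Adjust if the real file differs.
    """
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t",
                        quoting=csv.QUOTE_NONE)
    cases = []
    for row in reader:
        # Skip blank, short, or comment rows.
        if len(row) < 2 or row[0].startswith("#"):
            continue
        cases.append((row[0], row[1].split()))
    return cases


def find_failures(cases, tokenize):
    """Return (input, expected, actual) for every case that does not match."""
    failures = []
    for text, expected in cases:
        actual = list(tokenize(text))
        if actual != expected:
            failures.append((text, expected, actual))
    return failures


# Two made-up rows in the assumed layout, not taken from the lingpy file.
sample = "abc\ta b c\ntʃa\ttʃ a\n"
cases = load_cases(sample)

# `list` is a placeholder tokenizer; pass your own callable instead.
failures = find_failures(cases, list)
for text, expected, actual in failures:
    print(f"{text!r}: expected {expected}, got {actual}")
```

To run it on the real data, read the downloaded file's contents into `load_cases` and pass your tokenizer function in place of `list`.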