
Missing pos for PUNCT

Open 10zinten opened this issue 3 years ago • 0 comments

System:

  • botok: v0.8.8

Reproduce

from botok import WordTokenizer
wt = WordTokenizer()  # default configuration
tokens = wt.tokenize("༄༅། །བློ་སྦྱོང་དོན་?")
print(tokens[0])

Output

text: "༄༅། །"
char_types: |NORMAL_PUNCT|NORMAL_PUNCT|NORMAL_PUNCT|TRANSPARENT|NORMAL_PUNCT|
chunk_type: PUNCT
start: 0
len: 5

Expected output:

text: "༄༅། །"
char_types: |NORMAL_PUNCT|NORMAL_PUNCT|NORMAL_PUNCT|TRANSPARENT|NORMAL_PUNCT|
chunk_type: PUNCT
pos: PUNCT
start: 0
len: 5
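
Until this is fixed, a possible workaround is to fill in the missing value after tokenizing. The sketch below is not part of botok: `fill_missing_pos` is a hypothetical helper, and it assumes each token exposes `chunk_type` and `pos` as plain attributes, as the output above suggests. A stand-in object is used in place of a real token so the snippet runs without botok installed.

```python
from types import SimpleNamespace


def fill_missing_pos(tokens):
    """Set pos to "PUNCT" on punctuation tokens that have no pos (hypothetical helper)."""
    for token in tokens:
        is_punct = getattr(token, "chunk_type", None) == "PUNCT"
        has_pos = bool(getattr(token, "pos", None))
        if is_punct and not has_pos:
            token.pos = "PUNCT"
    return tokens


# Stand-in for the first token in the report; not a real botok token.
demo = SimpleNamespace(text="༄༅། །", chunk_type="PUNCT", pos=None, start=0, len=5)
fill_missing_pos([demo])
print(demo.pos)
```

With real tokens the call would be `fill_missing_pos(wt.tokenize(...))`, assuming the attribute names match.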

10zinten, May 09 '22 10:05