Creating a user dictionary source file from a token and POS mapping file

Open santosa-malika opened this issue 3 years ago • 2 comments

Is there any utility to generate the "user dictionary source file" from a raw file that contains sentences together with their tokens and the POS mapping for each token? In other words, if we already have the token-to-POS mapping, is there an easy way to generate the user dictionary source file?

For example, if we have a file like the one below, or one in a similar format, can we generate the user dictionary source file from it?

image

santosa-malika, Feb 10 '22 06:02

Do you want to implement an analyzer for Thai(?) only, or use it to analyze mixed Thai(?)-Japanese data?

eiennohito, Feb 16 '22 01:02

I want this only for Japanese. The earlier example was in Thai by mistake; the Japanese example is below. The data can be in any format, but I can provide the following:

  1. Raw sentences with their tokens and the POS mapping for each token

image

santosa-malika, Feb 16 '22 03:02
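
For anyone landing on this thread with the same question, here is a rough sketch of how such a conversion could look in Python. It is not a Sudachi utility and makes several assumptions that should be checked against the Sudachi user dictionary documentation: that a source line has 18 comma-separated columns (surface, left ID, right ID, cost, display surface, six POS fields, reading, normalized form, dictionary form ID, split type, A-unit split, B-unit split, unused), and that the input is a hypothetical tab-separated file with `surface<TAB>comma-separated POS<TAB>reading` on each line, since the attached images are not available. The helper names `make_row` and `convert` are made up for illustration, and the connection IDs and cost are left as caller-supplied parameters because suitable values depend on the system dictionary in use.

```python
import csv
import io

POS_FIELDS = 6  # assumption: six POS levels, missing levels padded with "*"


def make_row(surface, pos, reading, left_id, right_id, cost):
    """Build one row in the ASSUMED 18-column user dictionary layout."""
    pos = list(pos)[:POS_FIELDS]
    pos += ["*"] * (POS_FIELDS - len(pos))
    return [
        surface,            # headword used for lookup
        str(left_id),       # left connection ID (caller-supplied)
        str(right_id),      # right connection ID (caller-supplied)
        str(cost),          # word cost (caller-supplied)
        surface,            # headword shown in analysis output
        *pos,               # six POS fields
        reading,            # reading
        surface,            # normalized form (here: same as the surface)
        "*", "*", "*", "*", "*",  # dict form ID, split type, A split, B split, unused
    ]


def convert(text, left_id, right_id, cost):
    """Convert hypothetical 'surface<TAB>pos<TAB>reading' lines to CSV text."""
    seen = set()
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        surface, pos_text, reading = line.split("\t")
        key = (surface, pos_text)
        if key in seen:
            continue  # the same token usually appears in many sentences
        seen.add(key)
        writer.writerow(
            make_row(surface, pos_text.split(","), reading, left_id, right_id, cost)
        )
    return out.getvalue()


if __name__ == "__main__":
    sample = "東京スカイツリー\t名詞,固有名詞,一般\tトウキョウスカイツリー\n"
    # The IDs and cost below are placeholders, not recommended values.
    print(convert(sample, left_id=1, right_id=1, cost=5000), end="")
```

Tokens are deduplicated by surface and POS because a sentence-level corpus repeats the same word many times, while the dictionary source needs each entry only once. Words that the system dictionary already covers would still need to be filtered out separately.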