jumanpp
Training API
I would like to train the model on a different dataset. Does it have any training API that I could call?
You need more than a dataset: you need a segmentation dictionary and an annotated corpus. We need to release our segmentation dictionary, but there is almost no documentation on how to use it.
For training, if you plan to use Jumandic for segmentation, you need only a corpus and can use the <build_dir>/src/jumandic/jpp_train_jumandic binary to train a model. This process needs to be documented.
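Since the training process is not documented yet, here is only a minimal sketch of the first step: checking that the trainer binary exists at the path mentioned above after you build Juman++. The helper name `find_trainer` and the `build` directory name are my own assumptions for illustration, and the sketch deliberately does not pass any arguments to the binary, because its options are not described in this thread.

```python
import os
from pathlib import Path


def find_trainer(build_dir):
    """Return the path to jpp_train_jumandic under build_dir.

    Returns None if the file is missing or not executable.
    The relative location src/jumandic/jpp_train_jumandic comes from
    the answer above; find_trainer itself is a hypothetical helper.
    """
    candidate = Path(build_dir) / "src" / "jumandic" / "jpp_train_jumandic"
    if candidate.is_file() and os.access(candidate, os.X_OK):
        return candidate
    return None


if __name__ == "__main__":
    # "build" is an assumed build directory name; replace it with yours.
    trainer = find_trainer("build")
    if trainer is None:
        print("jpp_train_jumandic not found; check your build directory")
    else:
        print(f"trainer binary: {trainer}")
```

If the binary is found, you can run it on your annotated corpus; which arguments it expects is exactly the part that still needs documentation.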
If not, please see https://github.com/eiennohito/jumanpp-t9 for how to use Juman++ with your own dataset/segmentation standard.