xfspell
Training own model
Hi, super interesting work, and thanks for sharing!
I'm wondering if you could include a training script in the repository that would allow one to train one's own model. :) Additionally, could you say something about how long training took (and on what hardware).
+following
+1
Maybe you can train your own model by following this blog post: http://www.realworldnlpbook.com/blog/unreasonable-effectiveness-of-transformer-spell-checker.html
@mhagiwara How do I create my own token file (.tok) from my dataset? I have a dataset of 20 lakh (2 million) food item names, and I want to train a model to correct them. Your blog post describes the training process, but I am confused about how to create the .tok file.
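In case it helps others with the same question, here is a minimal sketch of one way to produce such a file. It assumes that a .tok file is plain text with one example per line, that each example is split into space-separated characters, and that original spaces are replaced by a placeholder symbol (`_` here). These are assumptions, not something confirmed in this thread, so please check the blog post for the exact convention. The names `to_char_tokens` and `write_tok_file` are hypothetical helpers, not part of xfspell or fairseq.

```python
def to_char_tokens(text, space_symbol="_"):
    """Split a string into space-separated characters.

    Whitespace characters are replaced by `space_symbol` (an assumed
    convention) so that word boundaries survive tokenization.
    """
    chars = [space_symbol if ch.isspace() else ch for ch in text.strip()]
    return " ".join(chars)


def write_tok_file(lines, path):
    """Write one character-tokenized example per line to `path`."""
    with open(path, "w", encoding="utf-8") as f:
        for line in lines:
            f.write(to_char_tokens(line) + "\n")


if __name__ == "__main__":
    # Hypothetical noisy/clean pair; a parallel corpus needs one file per side.
    print(to_char_tokens("panner tikka"))
    print(to_char_tokens("paneer tikka"))
```

If the item names themselves can contain `_`, a different placeholder would be needed to keep the mapping reversible.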
@mhagiwara I was able to train the model on my own dataset using the xfspell architecture. But now, when I try to run inference, I get an error in fairseq:
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/fairseq/checkpoint_utils.py", line 151, in load_checkpoint_to_cpu
    from fairseq.fb_pathmgr import fb_pathmgr
ModuleNotFoundError: No module named 'fairseq.fb_pathmgr'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/bin/fairseq-interactive", line 8, in <module>
    sys.exit(cli_main())
  File "/usr/local/lib/python3.6/dist-packages/fairseq_cli/interactive.py", line 190, in cli_main
    main(args)
  File "/usr/local/lib/python3.6/dist-packages/fairseq_cli/interactive.py", line 82, in main
    task=task,
  File "/usr/local/lib/python3.6/dist-packages/fairseq/checkpoint_utils.py", line 179, in load_model_ensemble
    ensemble, args, _task = load_model_ensemble_and_task(filenames, arg_overrides, task)
  File "/usr/local/lib/python3.6/dist-packages/fairseq/checkpoint_utils.py", line 190, in load_model_ensemble_and_task
    state = load_checkpoint_to_cpu(filename, arg_overrides)
  File "/usr/local/lib/python3.6/dist-packages/fairseq/checkpoint_utils.py", line 160, in load_checkpoint_to_cpu
    path, map_location=lambda s, l: default_restore_location(s, "cpu")
  File "/usr/local/lib/python3.6/dist-packages/torch/serialization.py", line 577, in load
    with _open_zipfile_reader(opened_file) as opened_zipfile:
  File "/usr/local/lib/python3.6/dist-packages/torch/serialization.py", line 241, in __init__
    super(_open_zipfile_reader, self).__init__(torch._C.PyTorchFileReader(name_or_buffer))
RuntimeError: [enforce fail at inline_container.cc:144] . PytorchStreamReader failed reading zip archive: failed finding central directory
Please help, as I have been unable to resolve this.
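One possible reading of the final RuntimeError is that the checkpoint file passed to fairseq-interactive is incomplete or corrupted (for example, a copy or save that was interrupted), since "failed finding central directory" means the zip index at the end of the file is missing. The sketch below uses only the Python standard library to check that; `describe_checkpoint` is a hypothetical helper and the path in the usage example is a placeholder, not a path from this thread.

```python
import os
import zipfile


def describe_checkpoint(path):
    """Report basic integrity facts about a checkpoint file.

    A file that starts with the zip signature but fails
    zipfile.is_zipfile() has lost its end-of-archive record,
    which usually means it was truncated.
    """
    with open(path, "rb") as f:
        magic = f.read(4)
    return {
        "size_bytes": os.path.getsize(path),
        "starts_like_zip": magic == b"PK\x03\x04",
        "is_complete_zip": zipfile.is_zipfile(path),
    }


if __name__ == "__main__":
    # Placeholder path; substitute the checkpoint given to fairseq-interactive.
    checkpoint_path = "checkpoint_best.pt"
    if os.path.exists(checkpoint_path):
        print(describe_checkpoint(checkpoint_path))
```

If `starts_like_zip` is true and `is_complete_zip` is false, comparing the file size against the copy on the training machine, or re-copying the checkpoint, would be a reasonable next step.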