nmt
How do I add my custom dataset and train on it?
I have my own dataset for the chatbot, and I want to train on that data only. Can you please tell me what changes I need to make to do this?
You need these files:
- train.src, a file containing the training questions.
- train.tgt, a file containing the training responses.
- dev.src, a file containing the validation questions.
- dev.tgt, a file containing the validation responses.
- test.src, a file containing the test questions.
- test.tgt, a file containing the test responses.
- vocab.src, the vocabulary of all questions.
- vocab.tgt, the vocabulary of all responses.
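If you do not have the vocab files yet, they can be generated from the training files. Below is a minimal sketch, not part of the nmt repo: the helper names `build_vocab` and `write_vocab` are made up for illustration, it assumes one whitespace-tokenized sentence per line, and it assumes the special tokens `<unk>`, `<s>` and `</s>` should come first, which I believe matches the nmt defaults.

```python
from collections import Counter

# Assumption: these are the special tokens nmt expects at the top of a vocab file.
SPECIAL_TOKENS = ["<unk>", "<s>", "</s>"]


def build_vocab(lines, max_size=None):
    """Return the special tokens followed by corpus tokens, most frequent first."""
    counts = Counter()
    for line in lines:
        # str.split() with no argument never yields empty tokens.
        counts.update(line.split())
    # Avoid listing a special token twice if it also appears in the corpus.
    for token in SPECIAL_TOKENS:
        counts.pop(token, None)
    # Sort by descending frequency, then alphabetically so the order is stable.
    ordered = sorted(counts, key=lambda t: (-counts[t], t))
    if max_size is not None:
        ordered = ordered[:max_size]
    return SPECIAL_TOKENS + ordered


def write_vocab(corpus_path, vocab_path, max_size=None):
    """Read a corpus file and write one vocabulary token per line."""
    with open(corpus_path, encoding="utf-8") as f:
        vocab = build_vocab(f, max_size)
    with open(vocab_path, "w", encoding="utf-8") as f:
        f.write("\n".join(vocab) + "\n")
```

For example, `write_vocab("train.src", "vocab.src")` and `write_vocab("train.tgt", "vocab.tgt")` would produce the two vocab files, with no empty lines and no repeated words.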
If you have already prepared these files, you can start training:
python -m nmt.nmt \
--out_dir=$YOUR_OUT_DIR \
--src=src --tgt=tgt \
--train_prefix=$FILE_PATH/train \
--dev_prefix=$FILE_PATH/dev \
--test_prefix=$FILE_PATH/test \
... (other args)
The log output is clear: your vocab files contain empty lines. You need to make sure there are:
- NO empty lines in the vocab files
- NO repeated words in the vocab files
You can write a simple Python script to filter out the empty lines and the repeated words, and then try again.
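A minimal sketch of such a script is below. The function names are made up for illustration; it assumes one token per line and keeps the first occurrence of each word so the original order is preserved.

```python
def clean_vocab_lines(lines):
    """Drop empty or whitespace-only lines and repeated words, keeping order."""
    seen = set()
    cleaned = []
    for line in lines:
        token = line.strip()
        if not token or token in seen:
            continue
        seen.add(token)
        cleaned.append(token)
    return cleaned


def clean_vocab_file(path):
    """Rewrite a vocab file in place without empty lines or duplicates."""
    with open(path, encoding="utf-8") as f:
        cleaned = clean_vocab_lines(f)
    with open(path, "w", encoding="utf-8") as f:
        f.write("\n".join(cleaned) + "\n")
```

Run `clean_vocab_file` on both vocab.src and vocab.tgt. Note that the error above points at copies under /tmp/nmt_model, so it may also be necessary to delete those copies (or use a fresh out_dir) before retrying.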
On Sat, Jun 23, 2018 at 4:37 AM Shresth wrote:
@luozhouyang https://github.com/luozhouyang Sir, I am hitting a problem with the vocab files; the whole error is below. I checked both vocab files in full and could not find any stray space, so I don't know what is causing this. Earlier I was getting errors in the train, test and dev files themselves, and I sorted those out. I have been stuck on this for about a week without making any progress. Please help me out. Thank you.
Here is the error:
2018-06-23 02:01:11.183505: W tensorflow/core/framework/op_kernel.cc:1278] OP_REQUIRES failed at lookup_table_init_op.cc:145 : Invalid argument: Invalid content in /tmp/nmt_model/vocab.tgt: empty line found at position 149.
2018-06-23 02:01:11.183537: W tensorflow/core/framework/op_kernel.cc:1278] OP_REQUIRES failed at lookup_table_init_op.cc:145 : Invalid argument: Invalid content in /tmp/nmt_model/vocab.src: empty line found at position 77.
Traceback (most recent call last):
  File "/home/shresth/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1330, in _do_call
    return fn(*args)
  File "/home/shresth/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1315, in _run_fn
    options, feed_dict, fetch_list, target_list, run_metadata)
  File "/home/shresth/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1423, in _call_tf_sessionrun
    status, run_metadata)
  File "/home/shresth/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py", line 516, in __exit__
    c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: Invalid content in /tmp/nmt_model/vocab.tgt: empty line found at position 149.
   [[Node: string_to_index_1/hash_table/table_init = InitializeTableFromTextFileV2[delimiter="\t", key_index=-2, value_index=-1, vocab_size=-1, _device="/job:localhost/replica:0/task:0/device:CPU:0"](string_to_index_1/hash_table, string_to_index_1/hash_table/table_init/asset_filepath)]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/shresth/anaconda3/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/home/shresth/anaconda3/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/shresth/Desktop/WEB/CHATBOT/nmt/nmt/nmt.py", line 605, in <module>
    tf.app.run(main=main, argv=[sys.argv[0]] + unparsed)
  File "/home/shresth/anaconda3/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 126, in run
    _sys.exit(main(argv))
  File "/home/shresth/Desktop/WEB/CHATBOT/nmt/nmt/nmt.py", line 598, in main
    run_main(FLAGS, default_hparams, train_fn, inference_fn)
  File "/home/shresth/Desktop/WEB/CHATBOT/nmt/nmt/nmt.py", line 591, in run_main
    train_fn(hparams, target_session=target_session)
  File "/home/shresth/Desktop/WEB/CHATBOT/nmt/nmt/train.py", line 328, in train
    train_model.model, model_dir, train_sess, "train")
  File "/home/shresth/Desktop/WEB/CHATBOT/nmt/nmt/model_helper.py", line 572, in create_or_load_model
    session.run(tf.tables_initializer())
  File "/home/shresth/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 908, in run
    run_metadata_ptr)
  File "/home/shresth/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1143, in _run
    feed_dict_tensor, options, run_metadata)
  File "/home/shresth/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1324, in _do_run
    run_metadata)
  File "/home/shresth/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1343, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Invalid content in /tmp/nmt_model/vocab.tgt: empty line found at position 149.
   [[Node: string_to_index_1/hash_table/table_init = InitializeTableFromTextFileV2[delimiter="\t", key_index=-2, value_index=-1, vocab_size=-1, _device="/job:localhost/replica:0/task:0/device:CPU:0"](string_to_index_1/hash_table, string_to_index_1/hash_table/table_init/asset_filepath)]]

Caused by op 'string_to_index_1/hash_table/table_init', defined at:
  File "/home/shresth/anaconda3/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/home/shresth/anaconda3/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/shresth/Desktop/WEB/CHATBOT/nmt/nmt/nmt.py", line 605, in <module>
    tf.app.run(main=main, argv=[sys.argv[0]] + unparsed)
  File "/home/shresth/anaconda3/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 126, in run
    _sys.exit(main(argv))
  File "/home/shresth/Desktop/WEB/CHATBOT/nmt/nmt/nmt.py", line 598, in main
    run_main(FLAGS, default_hparams, train_fn, inference_fn)
  File "/home/shresth/Desktop/WEB/CHATBOT/nmt/nmt/nmt.py", line 591, in run_main
    train_fn(hparams, target_session=target_session)
  File "/home/shresth/Desktop/WEB/CHATBOT/nmt/nmt/train.py", line 296, in train
    train_model = model_helper.create_train_model(model_creator, hparams, scope)
  File "/home/shresth/Desktop/WEB/CHATBOT/nmt/nmt/model_helper.py", line 80, in create_train_model
    src_vocab_file, tgt_vocab_file, hparams.share_vocab)
  File "/home/shresth/Desktop/WEB/CHATBOT/nmt/nmt/utils/vocab_utils.py", line 87, in create_vocab_tables
    tgt_vocab_file, default_value=UNK_ID)
  File "/home/shresth/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/lookup_ops.py", line 999, in index_table_from_file
    init, default_value, shared_name=shared_name, name=hash_table_scope)
  File "/home/shresth/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/lookup_ops.py", line 279, in __init__
    super(HashTable, self).__init__(table_ref, default_value, initializer)
  File "/home/shresth/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/lookup_ops.py", line 171, in __init__
    self._init = initializer.initialize(self)
  File "/home/shresth/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/lookup_ops.py", line 520, in initialize
    name=scope)
  File "/home/shresth/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/gen_lookup_ops.py", line 317, in initialize_table_from_text_file_v2
    vocab_size=vocab_size, delimiter=delimiter, name=name)
  File "/home/shresth/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/home/shresth/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3327, in create_op
    op_def=op_def)
  File "/home/shresth/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1674, in __init__
    self._traceback = self._graph._extract_stack()  # pylint: disable=protected-access

InvalidArgumentError (see above for traceback): Invalid content in /tmp/nmt_model/vocab.tgt: empty line found at position 149.
   [[Node: string_to_index_1/hash_table/table_init = InitializeTableFromTextFileV2[delimiter="\t", key_index=-2, value_index=-1, vocab_size=-1, _device="/job:localhost/replica:0/task:0/device:CPU:0"](string_to_index_1/hash_table, string_to_index_1/hash_table/table_init/asset_filepath)]]
@luozhouyang Thank you sir, that worked, and I have now trained the chatbot too. But now there is another error: when I run the model_test.py file, it reports some failures. Here's the pic. I am using the command python -m nmt.model_test

@shresthpaul133 I am new to this topic. Can you tell me how to create train.src, train.tgt and the other files? My text is in a .txt file.
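One possible way to do this is sketched below. It is not from the nmt repo and rests on an assumption about the input: each line of the .txt file holds one question and its response separated by a tab. The function names, the 10% dev and test fractions, and the fixed seed are all illustrative choices.

```python
import random


def parse_pairs(lines, sep="\t"):
    """Parse 'question<sep>response' lines, skipping malformed or empty ones."""
    pairs = []
    for line in lines:
        line = line.strip()
        if sep not in line:
            continue
        question, response = line.split(sep, 1)
        question, response = question.strip(), response.strip()
        if question and response:
            pairs.append((question, response))
    return pairs


def split_pairs(pairs, dev_frac=0.1, test_frac=0.1, seed=0):
    """Shuffle the pairs reproducibly and split them into train/dev/test."""
    pairs = list(pairs)
    random.Random(seed).shuffle(pairs)
    n_dev = int(len(pairs) * dev_frac)
    n_test = int(len(pairs) * test_frac)
    return {
        "dev": pairs[:n_dev],
        "test": pairs[n_dev:n_dev + n_test],
        "train": pairs[n_dev + n_test:],
    }


def write_split(pairs, prefix):
    """Write questions to <prefix>.src and responses to <prefix>.tgt, line-aligned."""
    with open(prefix + ".src", "w", encoding="utf-8") as src, \
            open(prefix + ".tgt", "w", encoding="utf-8") as tgt:
        for question, response in pairs:
            src.write(question + "\n")
            tgt.write(response + "\n")


def prepare_dataset(txt_path, out_dir="."):
    """Create train/dev/test .src and .tgt files from a tab-separated text file."""
    with open(txt_path, encoding="utf-8") as f:
        splits = split_pairs(parse_pairs(f))
    for name, subset in splits.items():
        write_split(subset, out_dir + "/" + name)
```

Calling `prepare_dataset("data.txt")` (a hypothetical file name) would write the six files into the current directory. Line N of each .src file must correspond to line N of the matching .tgt file, which is why each pair is written together.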