tf_seq2seq_chatbot

Error when run "python train.py"

Open minhntm opened this issue 8 years ago • 15 comments

Preparing dialog data in /var/lib/tf_seq2seq_chatbot/data
Creating vocabulary /var/lib/tf_seq2seq_chatbot/data/vocab20000.in from data /var/lib/tf_seq2seq_chatbot/data/chat.in
Traceback (most recent call last):
  File "train.py", line 15, in <module>
    tf.app.run()
  File "/home/minh/opt/anaconda3/envs/ml/lib/python3.5/site-packages/tensorflow/python/platform/app.py", line 30, in run
    sys.exit(main(sys.argv))
  File "train.py", line 12, in main
    train()
  File "/home/minh/source/machine-learing/chatbot/tf_seq2seq_chatbot/tf_seq2seq_chatbot/lib/train.py", line 22, in train
    train_data, dev_data, _ = data_utils.prepare_dialog_data(FLAGS.data_dir, FLAGS.vocab_size)
  File "/home/minh/source/machine-learing/chatbot/tf_seq2seq_chatbot/tf_seq2seq_chatbot/lib/data_utils.py", line 200, in prepare_dialog_data
    create_vocabulary(vocab_path, train_path + ".in", vocabulary_size)
  File "/home/minh/source/machine-learing/chatbot/tf_seq2seq_chatbot/tf_seq2seq_chatbot/lib/data_utils.py", line 70, in create_vocabulary
    for line in f:
  File "/home/minh/opt/anaconda3/envs/ml/lib/python3.5/site-packages/tensorflow/python/platform/gfile.py", line 176, in next
    return next(self._fp)
  File "/home/minh/opt/anaconda3/envs/ml/lib/python3.5/codecs.py", line 321, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xad in position 2329: invalid start byte

minhntm avatar Dec 08 '16 08:12 minhntm

Hi Minhbk,

I'm getting the same error, and it's just as the message says: you're running into an encoding problem. The byte 0xAD (apparently a soft hyphen in Latin-1/Windows-1252) is not a valid start byte in UTF-8, so the file is probably not UTF-8 encoded.

Try opening with a different encoding if you can, otherwise modify the input text to use standard hyphens.
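
A minimal sketch of the "different encoding" route, assuming the file is really Windows-1252 or Latin-1 rather than UTF-8 (a guess based on the 0xAD byte, not something confirmed for this dataset). The helper name `read_text_tolerant` and the order of encodings are illustrative and not part of tf_seq2seq_chatbot:

```python
import io


def read_text_tolerant(path, encodings=("utf-8", "cp1252", "latin-1")):
    """Try each encoding in turn; return (encoding_used, decoded_text)."""
    for enc in encodings:
        try:
            with io.open(path, "r", encoding=enc) as f:
                return enc, f.read()
        except UnicodeDecodeError:
            # This encoding cannot represent the bytes; try the next one.
            continue
    raise ValueError("none of the encodings could decode %s" % path)
```

Latin-1 maps every byte to a character, so the last fallback never raises. Whether the decoded text is what the corpus was meant to contain still needs a manual look.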

tjrileywisc avatar Jan 01 '17 17:01 tjrileywisc

I ran this code with Python 2.7, and it works! I don't understand why it works :D

minhntm avatar Jan 02 '17 06:01 minhntm

Solution: remove or replace all non-ASCII characters in the training data. On Windows:

  1. Install Notepad++ from here: https://notepad-plus-plus.org/download/v7.3.html

  2. Open chat.in in Notepad++ and do the following:

  3. Ctrl+F

  4. Go to "Replace" tab

  5. Change search mode to "Regular Expression"

  6. Paste this regex for Non-ASCII Characters into "Find what :" field: [^\x00-\x7F]+

  7. Leave field "Replace with : " empty

  8. Push "Replace All"

  9. Also, you can try Encoding -> Convert to ANSI and then Save
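
For anyone without Notepad++, the same find-and-replace can probably be done in a few lines of Python. This is only a sketch: it works on raw bytes (so no decoding is involved), reuses the regex from step 6, and the function names are made up for illustration:

```python
import re

# Same pattern as in step 6, applied to bytes instead of text.
NON_ASCII = re.compile(rb"[^\x00-\x7F]+")


def strip_non_ascii(raw_bytes, replacement=b""):
    """Remove (or replace) every run of bytes outside the ASCII range."""
    return NON_ASCII.sub(replacement, raw_bytes)


def clean_file(src_path, dst_path):
    """Write a copy of src_path with all non-ASCII bytes removed."""
    with open(src_path, "rb") as src:
        raw = src.read()
    with open(dst_path, "wb") as dst:
        dst.write(strip_non_ascii(raw))
```

A call such as `clean_file("chat.in", "chat.clean.in")` (file names hypothetical) keeps the original untouched, which makes it easy to compare the two files afterwards.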

Enjoy, and be ready for a loooooong training time =)

The problem is that the dataset contains non-ASCII characters (about 3k occurrences), such as 0xAD (a soft hyphen, U+00AD in Unicode: http://www.fileformat.info/info/unicode/char/ad/index.htm) and 0x97 (a long dash).

Works on Win 10, GPU TensorFlow and Python 3.5.

oradomskyi avatar Jan 05 '17 14:01 oradomskyi

@FrayaMiner Thanks for your response. You are using the same OS configuration as I am: I have tested the Windows 10 + GPU + Python 3.5 configuration on an example TensorFlow model and it works fantastically. Could you please help me with the initial steps to run this chatbot module? I am getting the following error while executing the code:

(tensorflow-gpu) C:\Users\user1>python C:\Users\user1\Downloads\tf_seq2seq_chatbot\tf_seq2seq_chatbot\train.py
I c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\stream_executor\dso_loader.cc:128] successfully opened CUDA library cublas64_80.dll locally
I c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\stream_executor\dso_loader.cc:128] successfully opened CUDA library cudnn64_5.dll locally
I c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\stream_executor\dso_loader.cc:128] successfully opened CUDA library cufft64_80.dll locally
I c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\stream_executor\dso_loader.cc:128] successfully opened CUDA library nvcuda.dll locally
I c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\stream_executor\dso_loader.cc:128] successfully opened CUDA library curand64_80.dll locally
Preparing dialog data in /var/lib/tf_seq2seq_chatbot/data
Creating vocabulary /var/lib/tf_seq2seq_chatbot/data\vocab20000.in from data /var/lib/tf_seq2seq_chatbot/data\chat.in
Traceback (most recent call last):
  File "C:\Users\user1\Downloads\tf_seq2seq_chatbot\tf_seq2seq_chatbot\train.py", line 15, in <module>
    tf.app.run()
  File "C:\Users\user1\Anaconda3\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\platform\app.py", line 43, in run
    sys.exit(main(sys.argv[:1] + flags_passthrough))
  File "C:\Users\user1\Downloads\tf_seq2seq_chatbot\tf_seq2seq_chatbot\train.py", line 12, in main
    train()
  File "C:\Users\user1\Downloads\tf_seq2seq_chatbot\tf_seq2seq_chatbot\tf_seq2seq_chatbot\lib\train.py", line 22, in train
    train_data, dev_data, _ = data_utils.prepare_dialog_data(FLAGS.data_dir, FLAGS.vocab_size)
  File "C:\Users\user1\Downloads\tf_seq2seq_chatbot\tf_seq2seq_chatbot\tf_seq2seq_chatbot\lib\data_utils.py", line 200, in prepare_dialog_data
    create_vocabulary(vocab_path, train_path + ".in", vocabulary_size)
  File "C:\Users\user1\Downloads\tf_seq2seq_chatbot\tf_seq2seq_chatbot\tf_seq2seq_chatbot\lib\data_utils.py", line 70, in create_vocabulary
    for line in f:
  File "C:\Users\user1\Anaconda3\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\lib\io\file_io.py", line 162, in next
    return self.next()
  File "C:\Users\user1\Anaconda3\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\lib\io\file_io.py", line 156, in next
    retval = self.readline()
  File "C:\Users\user1\Anaconda3\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\lib\io\file_io.py", line 123, in readline
    self._preread_check()
  File "C:\Users\user1\Anaconda3\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\lib\io\file_io.py", line 73, in _preread_check
    compat.as_bytes(self.__name), 1024 * 512, status)
  File "C:\Users\user1\Anaconda3\envs\tensorflow-gpu\lib\contextlib.py", line 66, in exit
    next(self.gen)
  File "C:\Users\user1\Anaconda3\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\framework\errors_impl.py", line 469, in raise_exception_on_not_ok_status
    pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.NotFoundError: NewRandomAccessFile failed to Create/Open: /var/lib/tf_seq2seq_chatbot/data\chat.in : The system cannot find the path specified.

dsblr avatar Jan 09 '17 11:01 dsblr

@dsblr Solution:

  1. Copy the file "tf_seq2seq_chatbot/data/train/movie_lines_selected.txt"

  2. Rename the copy to "chat.in"

  3. In ../tf_seq2seq_chatbot/configs/config.py, specify the absolute path to the folder containing "chat.in", using Windows-style paths with the special characters escaped (the backslash '\' is a special character and needs to be escaped), for example: "D:\\tf_seq2seq_chatbot\\tf_seq2seq_chatbot\\data\\train\\" or "D:\\tf_seq2seq_chatbot\\tf_seq2seq_chatbot\\data\\train" (I'm not sure which one)

  4. Follow the steps from my comment above to clean the non-ASCII characters out of "chat.in"
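
A small sketch of the path-escaping point from step 3. The directory is just the example path from that step, and the constant and function names are hypothetical (I have not checked what config.py actually calls them). The mixed separators in the error message above ("/var/lib/tf_seq2seq_chatbot/data\chat.in") suggest the file name is appended with something like `os.path.join`, in which case the trailing backslash should not matter:

```python
import os

# Three spellings of the same example Windows directory.
ESCAPED = "D:\\tf_seq2seq_chatbot\\tf_seq2seq_chatbot\\data\\train"
RAW = r"D:\tf_seq2seq_chatbot\tf_seq2seq_chatbot\data\train"
FORWARD = "D:/tf_seq2seq_chatbot/tf_seq2seq_chatbot/data/train"

# What goes wrong without escaping: "\t" silently becomes a TAB character.
BROKEN = "D:\tf_seq2seq_chatbot"


def data_file(data_dir, name="chat.in"):
    """Join a directory and a file name, with or without a trailing separator."""
    return os.path.join(data_dir, name)
```

`ESCAPED` and `RAW` are the same string, and forward slashes are generally accepted by Windows file APIs as well, so any of the three should do.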

oradomskyi avatar Jan 10 '17 15:01 oradomskyi

@FrayaMiner Thank you for your response. I am now getting the following error:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xad in position 2329: invalid start byte
I have cleaned the chat.in file as per the steps you mentioned. Please help me resolve this.

dsblr avatar Jan 12 '17 05:01 dsblr

@dsblr You can try searching and cleaning the file manually, but I prefer cleaning with a regex. You can also try Encoding -> Convert to ANSI and then save.

The byte 0xad at position 2329 is a non-ASCII "soft hyphen": http://www.fileformat.info/info/unicode/char/ad/index.htm
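
To check whether any such bytes are still left after cleaning, something like the following could be used. It is a rough sketch with a made-up function name. Note also that the position in the error message may be relative to the chunk being decoded rather than to the start of the file, so the offsets need not match 2329 exactly:

```python
def find_non_ascii(raw_bytes, limit=10):
    """Return up to `limit` (offset, byte_value) pairs for bytes >= 0x80."""
    hits = []
    for offset, value in enumerate(bytearray(raw_bytes)):
        if value >= 0x80:
            hits.append((offset, value))
            if len(hits) >= limit:
                break
    return hits
```

Reading the file with `open(path, "rb").read()` and passing the result in should return an empty list once the file is really clean.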

oradomskyi avatar Jan 12 '17 08:01 oradomskyi

@FrayaMiner thanks once again for your response. I tried to remove the soft hyphens manually, but the error still persists. I believe I might have missed a few of the soft hyphens. Could you please share the cleaned chat file that you are using? It would be very helpful.

Thanks

dsblr avatar Jan 12 '17 13:01 dsblr

@FrayaMiner, is there a way to solve the following error? I am getting this while running test.py:
  "C:\Work\tf_seq2seq_chatbot\tf_seq2seq_chatbot\lib\seq2seq_model_utils.py", line 43, in get_predicted_sentence
    bucket_id = min([b for b in xrange(len(BUCKETS)) if BUCKETS[b][0] > len(input_token_ids)])
NameError: name 'xrange' is not defined

The above error was solved after importing xrange into seq2seq_model_utils.py.
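
For reference, a sketch of one way to make `xrange` available on Python 3 without extra packages (`from six.moves import xrange` is another common option). The bucket sizes below are illustrative only, not necessarily the project's real `BUCKETS`, and `pick_bucket` is a made-up name wrapping the line from the traceback:

```python
# Compatibility shim: make the name xrange available on Python 3.
try:
    xrange  # exists as a built-in on Python 2
except NameError:
    xrange = range  # Python 3's range is already lazy

# Illustrative bucket sizes only; the project's real BUCKETS may differ.
BUCKETS = [(5, 10), (10, 15), (20, 25), (40, 50)]


def pick_bucket(input_token_ids):
    """Index of the smallest bucket whose input side is longer than the sentence."""
    return min(b for b in xrange(len(BUCKETS))
               if BUCKETS[b][0] > len(input_token_ids))
```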

But while executing chat.py, I am getting another error:

hello
Traceback (most recent call last):
  File "chat.py", line 15, in <module>
    tf.app.run()
  File "D:\Users\user1\chatbots\tensorflow\softwares\anaconda\lib\site-packages\tensorflow\python\platform\app.py", line 43, in run
    sys.exit(main(sys.argv[:1] + flags_passthrough))
  File "chat.py", line 12, in main
    chat()
  File "C:\Work\tf_seq2seq_chatbot\tf_seq2seq_chatbot\lib\chat.py", line 27, in chat
    predicted_sentence = get_predicted_sentence(sentence, vocab, rev_vocab, model, sess)
  File "C:\Work\tf_seq2seq_chatbot\tf_seq2seq_chatbot\lib\seq2seq_model_utils.py", line 65, in get_predicted_sentence
    output_sentence = ' '.join([rev_vocab[output] for output in outputs])
  File "C:\Work\tf_seq2seq_chatbot\tf_seq2seq_chatbot\lib\seq2seq_model_utils.py", line 65, in <listcomp>
    output_sentence = ' '.join([rev_vocab[output] for output in outputs])
IndexError: list index out of range

dsblr avatar Mar 21 '17 08:03 dsblr

Guys, any idea why I am getting the error below? I have the file ssd_mobilenet_v1_pets.config in the folder, but it says it cannot open it, without specifying the reason why it cannot find the file. I am working on Windows, with Python 3.5.3 and the latest TensorFlow version.

Traceback (most recent call last):
  File "C:\Program Files (x86)\Python 3.5.2\tensorflow\my codes\objectDetectionOwnTrained_Flowers\train.py", line 202, in <module>
    tf.app.run()
  File "C:\Program Files (x86)\Python 3.5.2\lib\site-packages\tensorflow\python\platform\app.py", line 48, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "C:\Program Files (x86)\Python 3.5.2\tensorflow\my codes\objectDetectionOwnTrained_Flowers\train.py", line 147, in main
    model_config, train_config, input_config = get_configs_from_pipeline_file()
  File "C:\Program Files (x86)\Python 3.5.2\tensorflow\my codes\objectDetectionOwnTrained_Flowers\train.py", line 104, in get_configs_from_pipeline_file
    text_format.Merge(f.read(), pipeline_config)
  File "C:\Program Files (x86)\Python 3.5.2\lib\site-packages\tensorflow\python\lib\io\file_io.py", line 118, in read
    self._preread_check()
  File "C:\Program Files (x86)\Python 3.5.2\lib\site-packages\tensorflow\python\lib\io\file_io.py", line 78, in _preread_check
    compat.as_bytes(self.__name), 1024 * 512, status)
  File "C:\Program Files (x86)\Python 3.5.2\lib\contextlib.py", line 66, in exit
    next(self.gen)
  File "C:\Program Files (x86)\Python 3.5.2\lib\site-packages\tensorflow\python\framework\errors_impl.py", line 466, in raise_exception_on_not_ok_status
    pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.NotFoundError: NewRandomAccessFile failed to Create/Open: ssd_mobilenet_v1_pets.config : [localized system error message, garbled by encoding]

fastlater avatar Aug 28 '17 03:08 fastlater

Have you solved the problem?

danwenxuan avatar Sep 25 '17 12:09 danwenxuan

Can anyone solve the below errors?

Traceback (most recent call last):
  File "export_inference_graph.py", line 106, in <module>
    tf.app.run()
  File "/home/ameya/anaconda3/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 48, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "export_inference_graph.py", line 99, in main
    text_format.Merge(f.read(), pipeline_config)
  File "/home/ameya/anaconda3/lib/python3.6/site-packages/tensorflow/python/lib/io/file_io.py", line 118, in read
    self._preread_check()
  File "/home/ameya/anaconda3/lib/python3.6/site-packages/tensorflow/python/lib/io/file_io.py", line 78, in _preread_check
    compat.as_bytes(self.__name), 1024 * 512, status)
  File "/home/ameya/anaconda3/lib/python3.6/contextlib.py", line 89, in exit
    next(self.gen)
  File "/home/ameya/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status
    pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.NotFoundError: training/to/ssd_inception_v2.config

ameyakale603 avatar Sep 26 '17 18:09 ameyakale603

  File "train.py", line 184, in <module>
    tf.app.run()
  File "C:\Users\Robin\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\platform\app.py", line 126, in run
    _sys.exit(main(argv))
  File "train.py", line 180, in main
    graph_hook_fn=graph_rewriter_fn)
  File "C:\TensorFlow\models\research\object_detection\trainer.py", line 274, in train
    train_config.prefetch_queue_capacity, data_augmentation_options)
  File "C:\TensorFlow\models\research\object_detection\trainer.py", line 59, in create_input_queue
    tensor_dict = create_tensor_dict_fn()
  File "train.py", line 121, in get_next
    dataset_builder.build(config)).get_next()
  File "C:\TensorFlow\models\research\object_detection\builders\dataset_builder.py", line 176, in build
    num_additional_channels=num_additional_channels)
  File "C:\TensorFlow\models\research\object_detection\data_decoders\tf_example_decoder.py", line 267, in init
    use_display_name)
  File "C:\TensorFlow\models\research\object_detection\utils\label_map_util.py", line 152, in get_label_map_dict
    label_map = load_labelmap(label_map_path)
  File "C:\TensorFlow\models\research\object_detection\utils\label_map_util.py", line 132, in load_labelmap
    label_map_string = fid.read()
  File "C:\Users\Robin\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\lib\io\file_io.py", line 120, in read
    self._preread_check()
  File "C:\Users\Robin\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\lib\io\file_io.py", line 80, in _preread_check
    compat.as_bytes(self.__name), 1024 * 512, status)
  File "C:\Users\Robin\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\framework\errors_impl.py", line 519, in exit
    c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.NotFoundError: NewRandomAccessFile failed to Create/Open: C:/tensorflow1/models/research/object_detection/training/labelmap.pbtxt : The system cannot find the path specified. ; No such process

C:\TensorFlow\models\research\object_detection>

Rabin-Rai avatar Jun 23 '18 17:06 Rabin-Rai

Hello, I want to ask if you have solved the problem

kaaier avatar Dec 14 '18 13:12 kaaier

I am very worried. Have you solved it? My error ends with: label_map??????????????/ PBTXT: [system error message, garbled by encoding]

kaaier avatar Dec 14 '18 13:12 kaaier