chatbot
chatbot copied to clipboard
TypeError: cannot use a bytes pattern on a string-like object
/usr/bin/python3.5 /home/scrooge/chatbot/seqGanChatbot/execute.py
/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/dtypes.py:493: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint8 = np.dtype([("qint8", np.int8, 1)])
/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/dtypes.py:494: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_quint8 = np.dtype([("quint8", np.uint8, 1)])
/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/dtypes.py:495: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint16 = np.dtype([("qint16", np.int16, 1)])
/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/dtypes.py:496: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_quint16 = np.dtype([("quint16", np.uint16, 1)])
/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/dtypes.py:497: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint32 = np.dtype([("qint32", np.int32, 1)])
/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/dtypes.py:502: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
np_resource = np.dtype([("resource", np.ubyte, 1)])
2019-11-29 03:38:04.465325: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
Preparing Chitchat gen_data in ./gen_data/
Tokenizing disc_data in ./gen_data/train.answer
Traceback (most recent call last):
File "/home/scrooge/chatbot/seqGanChatbot/execute.py", line 355, in
Process finished with exit code 1
The spot where the issue occurs:
def basic_tokenizer(sentence): """Very basic tokenizer: split the sentence into a list of tokens.""" words = [] #sentence = tf.compat.as_bytes(sentence) for space_separated_fragment in sentence.strip().split(): words.extend(re.split(_WORD_SPLIT, space_separated_fragment)) return [w for w in words if w]
in data_utils.py
Runtime Environment: Ubuntu 16.04 Python 3.5 TensorFlow 1.5.0 PyCharm Community data were downloaded from the link of baidunetdisk given in README.MD
I found a very similar repo about SeqGAN: https://github.com/vpegasus/seqGan_chatbot/blob/master/utils/data_utils.py
and from his code I found the inexistence of regular expression part of this function:
def basic_tokenizer(sentence): """Very basic tokenizer: split the sentence into a list of tokens.""" words = [] for space_separated_fragment in sentence.strip().split(): words.extend(_WORD_SPLIT.split(space_separated_fragment)) return [w for w in words if w]
I haven't test his code yet, I am wondering how this removal make sense...