Results: 2 issues from June

Traceback (most recent call last):
  File "inference.py", line 3, in <module>
    from preprocess_data import preprocess_batch
  File "/data/jquan/codes/paraphraser-master/paraphraser/preprocess_data.py", line 26, in <module>
    word_to_id, idx_to_word, embedding, start_id, end_id, unk_id, mask_id = load_sentence_embeddings()
  File "/data/jquan/codes/paraphraser-master/paraphraser/embeddings.py", ...
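Since the traceback is cut off before the actual exception, one way to surface the full error is to call load_sentence_embeddings() directly. The call below is only a minimal sketch based on the names visible in the traceback; preprocess_data.py runs the same call at module import time (line 26), so whatever error appears here is what aborts inference.py.

    # Hypothetical isolation step: run from the paraphraser/ directory so the
    # embeddings module resolves, and observe the full exception that the
    # truncated traceback hides.
    from embeddings import load_sentence_embeddings

    # Same unpacking as preprocess_data.py line 26 in the traceback.
    word_to_id, idx_to_word, embedding, start_id, end_id, unk_id, mask_id = load_sentence_embeddings()
    print("vocab size:", len(word_to_id))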

Is the tokenizer implementation the same as the Python version? It seems that it cannot split Korean text into the correct wordpieces.
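One way to check is to tokenize the same Korean sentence with both implementations and compare the output token-for-token. The snippet below is a minimal sketch that assumes the Python reference is a WordPiece tokenizer loadable through Hugging Face's BertTokenizer with a multilingual vocab (an assumption; substitute whichever Python implementation is actually being compared).

    # Minimal comparison sketch (assumed reference: Hugging Face BertTokenizer
    # with the multilingual WordPiece vocab).
    from transformers import BertTokenizer

    tok = BertTokenizer.from_pretrained("bert-base-multilingual-cased")

    text = "한국어 형태소 분석"  # sample Korean text
    print(tok.tokenize(text))
    # Compare this list with the other implementation's output. Whole words
    # collapsing to [UNK] instead of ##-prefixed subwords usually points to a
    # difference in the vocab file or in the longest-match WordPiece logic.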