Double-quote and single-quote tokens are not handled correctly in vocab files where words are not wrapped in quotes

Open hepaajan opened this issue 5 years ago • 0 comments

In particular, the following branch strips the quote so that the token becomes an empty string, because a lone quote character (' or ") both starts and ends with a quote:

https://github.com/tensorflow/tensor2tensor/blob/5f9dd2db6d7797162e53adf152310ed13e9fc711/tensor2tensor/data_generators/text_encoder.py#L929

An easy fix is to also check that "len(s) > 1" in both conditions.
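
A minimal sketch of the idea, assuming the linked branch tests startswith/endswith for each quote character and then slices with s[1:-1]. Both function names below are hypothetical and are not part of tensor2tensor; they only isolate the quote-stripping logic so the difference is easy to see:

```python
def strip_wrapping_quotes_unguarded(s):
    """Hypothetical stand-in for the behavior described in this issue."""
    if (s.startswith("'") and s.endswith("'")) or (
        s.startswith('"') and s.endswith('"')
    ):
        # For a lone quote character, s[1:-1] is the empty string.
        return s[1:-1]
    return s


def strip_wrapping_quotes(s):
    """Hypothetical guarded version: only strip when len(s) > 1."""
    if len(s) > 1 and (
        (s.startswith("'") and s.endswith("'"))
        or (s.startswith('"') and s.endswith('"'))
    ):
        return s[1:-1]
    return s


if __name__ == "__main__":
    for token in ["'", '"', "'word'", '"word"', "word"]:
        print(repr(token), "->", repr(strip_wrapping_quotes(token)))
```

With the guard, a vocab line that is just ' or " is kept as-is, while wrapped entries such as 'word' are still unwrapped. Note that an unwrapped two-character token consisting of two quotes would still be reduced to an empty string, so this only covers the single-character case reported here.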

hepaajan, Oct 27 '20 07:10