key-value-memory-networks

tokenize function code in data_utils.py is incorrect

Open zpengc opened this issue 3 years ago • 0 comments

Given the intended behavior shown in the test:

>>> tokenize('Bob dropped the apple. Where is the apple?')
    ['Bob', 'dropped', 'the', 'apple', '.', 'Where', 'is', 'the', 'apple', '?']

the function should be written like this:

import re

def tokenize(sent):
    return re.findall(r"\w+(?:'\w+)?|[^\w\s]", sent)
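
A quick way to check the proposed pattern against the example above. This is only a minimal sketch: the name `TOKEN_RE` and the second test sentence are my own additions for illustration and do not come from data_utils.py.

```python
import re

# Pattern proposed in this issue: a run of word characters, optionally
# followed by one apostrophe suffix (e.g. "n't", "'s"), OR a single
# character that is neither a word character nor whitespace.
# TOKEN_RE is a hypothetical name used only in this sketch.
TOKEN_RE = re.compile(r"\w+(?:'\w+)?|[^\w\s]")


def tokenize(sent):
    """Return the tokens of a sentence, keeping punctuation as tokens."""
    # The group is non-capturing, so findall returns whole matches.
    return TOKEN_RE.findall(sent)


if __name__ == "__main__":
    # Example from the issue.
    print(tokenize('Bob dropped the apple. Where is the apple?'))
    # Extra illustrative case (an assumption, not from the repository).
    print(tokenize("Don't stop"))
```

With this pattern, punctuation comes out as separate tokens and whitespace is dropped, which is what the test example expects. Contractions such as "Don't" stay as a single token because of the optional apostrophe group.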

zpengc · Dec 05 '21 09:12