Chapter 6.1.1 Character-level one-hot encoding: the code in the book is different from the code on GitHub. Which is right?

Open alance123 opened this issue 6 years ago • 1 comments

The code in the book:

    import string
    samples = ['The cat sat on the mat.', 'The dog ate my homework.']
    characters = string.printable
    token_index = dict(zip(range(1, len(characters) + 1), characters))
    max_length = 50
    results = np.zeros((len(samples), max_length, max(token_index.keys()) + 1))
    for i, sample in enumerate(samples):
        for j, character in enumerate(sample):
            index = token_index.get(character)
            results[i, j, index] = 1.

The code on GitHub:

    import string

    samples = ['The cat sat on the mat.', 'The dog ate my homework.']
    characters = string.printable  # All printable ASCII characters.
    token_index = dict(zip(characters, range(1, len(characters) + 1)))

    max_length = 50
    results = np.zeros((len(samples), max_length, max(token_index.values()) + 1))
    for i, sample in enumerate(samples):
        for j, character in enumerate(sample[:max_length]):
            index = token_index.get(character)
            results[i, j, index] = 1.

Mar 25 '19 12:03 alance123

I noticed this too. The code in the book is wrong: the dictionary should map each character to a number, as the GitHub version does, so that `token_index.get(character)` returns an integer index.
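Here is a minimal sketch of the difference, assuming NumPy is installed. The names `book_index`, `github_index`, and `encode` are hypothetical, introduced only for this comparison. With the book's mapping the lookup returns `None`, and NumPy treats a `None` index as a new axis, so the assignment fills the whole row with ones instead of raising an error.

```python
import string

import numpy as np

samples = ['The cat sat on the mat.', 'The dog ate my homework.']
characters = string.printable  # All printable ASCII characters.

# Book version: integer -> character, so characters are not valid keys.
book_index = dict(zip(range(1, len(characters) + 1), characters))
# GitHub version: character -> integer, which is what the lookup needs.
github_index = dict(zip(characters, range(1, len(characters) + 1)))


def encode(samples, token_index, max_length=50):
    """Hypothetical helper wrapping the loop from the two listings."""
    depth = len(characters) + 1  # index 0 is left unused
    results = np.zeros((len(samples), max_length, depth))
    for i, sample in enumerate(samples):
        for j, character in enumerate(sample[:max_length]):
            index = token_index.get(character)
            # If index is None, NumPy reads it as np.newaxis and the
            # assignment sets every entry of results[i, j] to 1.
            results[i, j, index] = 1.
    return results


good = encode(samples, github_index)
bad = encode(samples, book_index)

print(book_index.get('T'))    # None: 'T' is a value here, not a key
print(github_index.get('T'))  # an integer index
print(good[0, 0].sum())       # one entry set for the first character
print(bad[0, 0].sum())        # every entry of the row set
```

The `sample[:max_length]` slice in the GitHub version also matters: without it, a sample longer than `max_length` would index past the second axis of `results`.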

Aug 01 '19 23:08 crista