character-bert
Printing character level vectors
Hi,
You're printing words and their embeddings using:
for token, embedding in zip(x, embeddings_for_x):
    print(token, embedding)
How can I see each letter's vector?
Hi @ozturkoktay, CharacterBERT is actually a word-level model. Although it reads each word's characters, it produces word-level vectors. If you really want to look at character vectors, the only way is to extract the character embedding layer. But note that the rows of this matrix do not correspond to characters but to utf-8 bytes. 😊
Hi @helboukkouri, How can I extract the character embedding layer? Can you please share a code example?
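A minimal sketch of the idea, not the repository's actual code: once the character embedding matrix is extracted, a "letter vector" is just the row indexed by each utf-8 byte of the word. Everything named here (`char_embedding_matrix`, `byte_vectors`, the 256 x 16 shape, the random values) is a hypothetical stand-in for illustration. The real matrix may also hold extra rows for special symbols such as padding or word boundaries, which this sketch ignores.

```python
import random

NUM_BYTE_VALUES = 256  # one row per possible utf-8 byte value (assumption)
EMBEDDING_DIM = 16     # hypothetical embedding size, chosen for illustration

random.seed(0)
# Stand-in for the model's character embedding matrix, filled with random values.
char_embedding_matrix = [
    [random.uniform(-1.0, 1.0) for _ in range(EMBEDDING_DIM)]
    for _ in range(NUM_BYTE_VALUES)
]


def byte_vectors(word):
    """Return a list of (byte, vector) pairs, one per utf-8 byte of `word`."""
    return [(b, char_embedding_matrix[b]) for b in word.encode("utf-8")]


for word in ["hello", "café"]:
    pairs = byte_vectors(word)
    print(word, "->", len(pairs), "byte vectors")
    for byte, vector in pairs:
        # Show only the first 4 dimensions to keep the output short.
        print(f"  byte {byte:3d}: {[round(v, 3) for v in vector[:4]]} ...")
```

Note that a non-ASCII letter such as "é" spans two bytes, so it gets two vectors rather than one. With the real model, the matrix would have to be located first; the attribute name is not given in this thread, so listing `model.named_parameters()` (a standard PyTorch `nn.Module` method) and looking for the character embedding weights is one way to find it.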