vecmap

Unicode error at line #31 in embeddings.py

Open sawan16 opened this issue 6 years ago • 3 comments

UnicodeEncodeError: 'utf-8' codec can't encode character '\udcf6' in position 0: surrogates not allowed

sawan16 avatar Mar 27 '19 15:03 sawan16

This obviously looks like an encoding problem, but I would need more details to know where it happens. Please report the full stack trace.

artetxem avatar Apr 16 '19 13:04 artetxem

UTF-8 encoding or decoding sometimes fails on certain symbols or letters. In those cases, you can either ignore such errors by passing errors='ignore' along with the encoding, or try a different encoding such as latin-1 (also known as ISO-8859-1). Hope this helps.
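For illustration, here is a minimal sketch of how a character like '\udcf6' can arise and what the options above do to it. It assumes the original bytes were Latin-1 text containing the byte 0xF6, which is a guess and not something confirmed in this thread; the sample bytes and variable names are made up for the example.

```python
# Assumption: the source file holds Latin-1 bytes; 0xF6 is "ö" in Latin-1
# but is not a valid byte in UTF-8.
raw = b"\xf6l"

# Decoding with surrogateescape keeps the bad byte as a lone surrogate.
text = raw.decode("utf-8", errors="surrogateescape")

# Encoding that string strictly reproduces the reported error.
try:
    text.encode("utf-8")
    error_message = None
except UnicodeEncodeError as exc:
    error_message = str(exc)

# Option 1: drop whatever cannot be encoded (this loses data).
dropped = text.encode("utf-8", errors="ignore")

# Option 2: round-trip the original bytes unchanged.
restored = text.encode("utf-8", errors="surrogateescape")

# Option 3: decode with the encoding the file was (presumably) written in.
as_latin1 = raw.decode("latin-1")

print(error_message)
print(dropped, restored, as_latin1)
```

Note that errors='ignore' silently discards the affected characters, so words in the vocabulary may change; decoding with the right source encoding is usually the safer fix when that encoding is known.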

SouravDutta91 avatar Aug 12 '19 00:08 SouravDutta91

The input embedding model is not in the correct format. Use model.save_word2vec_format(filename) to save the fastText or word2vec model.
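As a rough sketch of what that format looks like: the word2vec text format starts with a header line giving the vocabulary size and the vector dimensionality, followed by one line per word with its space-separated values. The helpers below (write_word2vec_text, read_header) and the toy vectors are hypothetical and use only the standard library; they are not part of gensim or vecmap.

```python
import os
import tempfile


def write_word2vec_text(path, vectors):
    """Hypothetical helper: write {word: [floats]} in word2vec text format."""
    dims = {len(vec) for vec in vectors.values()}
    if len(dims) != 1:
        raise ValueError("all vectors must have the same dimensionality")
    dim = dims.pop()
    with open(path, "w", encoding="utf-8") as f:
        # Header line: "<vocabulary size> <dimensionality>"
        f.write(f"{len(vectors)} {dim}\n")
        for word, vec in vectors.items():
            f.write(word + " " + " ".join(f"{x:.6f}" for x in vec) + "\n")


def read_header(path):
    """Hypothetical helper: return (count, dim) from the first line."""
    with open(path, encoding="utf-8") as f:
        count, dim = f.readline().split()
    return int(count), int(dim)


# Toy data, made up for the example.
toy = {"hello": [0.1, 0.2, 0.3], "world": [0.4, 0.5, 0.6]}
path = os.path.join(tempfile.mkdtemp(), "toy.vec")
write_word2vec_text(path, toy)
header = read_header(path)
print(header)
```

If the first line of your embeddings file does not parse as two integers like this, the file is probably a binary or native model dump and not the text format.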

suman101112 avatar Jan 15 '21 11:01 suman101112