ngram2vec

type error

Open chesterkuo opened this issue 6 years ago • 4 comments

I'm trying to use uni_bi.sh with a Chinese (UTF-8) word-segmented file, but I always get the following error. Any idea?

==========
Traceback (most recent call last):
  File "ngram2vec/pairs2counts.py", line 109, in <module>
    main()
  File "ngram2vec/pairs2counts.py", line 88, in main
    counts_file.write(str(old[0]) + " " + str(w) + " " + str(old[1][w]) + "\n")
TypeError: write() argument 1 must be unicode, not str

chesterkuo, Jun 19 '18 14:06

I think the error is related to the difference in character encoding handling between Python 2 and Python 3. Maybe using Python 2 could fix the problem?
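One note on the message itself: "write() argument 1 must be unicode, not str" is what Python 2's io text streams raise when they are handed a byte str, which suggests counts_file is such a stream (an assumption, since the script's open call is not shown in the traceback). A minimal sketch of a write that should behave the same on Python 2 and Python 3 is below. The names to_text and write_counts are hypothetical helpers for illustration and are not part of ngram2vec.

```python
# -*- coding: utf-8 -*-
from __future__ import unicode_literals

import io


def to_text(value, encoding="utf-8"):
    """Return value as a text string (unicode on Python 2, str on Python 3)."""
    if isinstance(value, bytes):
        # On Python 2, a plain str is bytes and must be decoded explicitly.
        return value.decode(encoding)
    # With unicode_literals, "{}" is a text literal, so format() returns text.
    return "{}".format(value)


def write_counts(path, word, context_counts):
    """Write one "word context count" line per context (hypothetical helper)."""
    # io.open returns a text stream on both Python versions and encodes on write.
    with io.open(path, "w", encoding="utf-8") as counts_file:
        for context, count in sorted(context_counts.items()):
            fields = [to_text(word), to_text(context), to_text(count)]
            counts_file.write(" ".join(fields) + "\n")
```

The idea is to replace the str(...) calls at line 88 with a conversion that always produces text, so the stream never receives a Python 2 byte string.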

zhezhaoa, Jun 20 '18 02:06

I am trying to run word2vecf.py from the simplified directory, but I get the following error:

Traceback (most recent call last):
  File "corpus2pairs.py", line 4, in <module>
    from corpus2vocab import getNgram
  File "/home/shubham/Inovanttech/W2V/ngram2vec/ngram2vec/simplified/corpus2vocab.py", line 4, in <module>
    from representations.matrix_serializer import save_count_vocabulary
ModuleNotFoundError: No module named 'representations'

shubhamnagalwade, Oct 07 '18 07:10

I am sorry that I didn't test the code in the simplified directory thoroughly. A simple solution is to copy the save_count_vocabulary function from representations.matrix_serializer into corpus2vocab.py, and then delete the line: from representations.matrix_serializer import save_count_vocabulary
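An alternative to copying the function, sketched here under assumptions, is to make the representations package importable by putting its parent directory on sys.path before the import runs. This is a different technique from the fix described above, and it assumes the representations directory sits somewhere above the simplified directory in the checkout. The helper names find_package_root and make_importable are hypothetical.

```python
import os
import sys


def find_package_root(start_dir, package_name):
    """Walk upward from start_dir and return the first directory that
    contains a subdirectory named package_name, or None if none is found."""
    current = os.path.abspath(start_dir)
    while True:
        if os.path.isdir(os.path.join(current, package_name)):
            return current
        parent = os.path.dirname(current)
        if parent == current:  # reached the filesystem root
            return None
        current = parent


def make_importable(start_dir, package_name):
    """Put the directory containing package_name at the front of sys.path."""
    root = find_package_root(start_dir, package_name)
    if root is not None and root not in sys.path:
        sys.path.insert(0, root)
    return root
```

In corpus2vocab.py this would be called as make_importable(os.path.dirname(os.path.abspath(__file__)), "representations") above the failing import line. Copying the function keeps the simplified scripts standalone, while this approach avoids duplicating code.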

zhezhaoa, Oct 07 '18 13:10

Okay, thank you. I will do that.


shubhamnagalwade, Oct 07 '18 15:10