mutli-sense-embedding

format of pre-trained embedding

Open • makrai opened this issue 7 years ago • 1 comment

Could you please tell me what format the pre-trained embedding file should be in? So far I've tried the full word2vec format (with the header), the same file without the header, and finally a file containing only the weights (without the words), but in each case I get exceptions like the one below:

Exception in thread "main" java.lang.NumberFormatException: For input string: "0.004003 0.004419 -0.003830 -0.003278 0.001367 0.003021 0.000941 0.000211 -0.003604 0.002218 -0.004356 0.001250 -0.000751 -0.000957 -0.003316 -0.001882 0.002579 0.003025 0.002969 0.001597 0.001545 -0.003803 -0.004096 0.004970 0.003801 0.003090 -0.000604 0.004016 -0.000495 0.000735 -0.000149 -0.002983 0.001312 -0.001337 -0.003825 0.004754 0.004379 -0.001095 -0.000226 0.000509 -0.003638 -0.004007 0.004555 0.000063 -0.002582 -0.003042 -0.003076 0.001697 0.000201 0.001331 -0.004214 -0.003808 -0.000130 0.001144 0.002550 -0.003170 0.004080 0.000927 0.001120 -0.000608 0.002986 -0.002288 -0.002097 0.002158 -0.000753 0.001031 0.001805 -0.004089 -0.001983 0.002914 0.004232 0.003932 -0.003047 -0.002108 -0.000909 0.002001 -0.003788 0.002998 0.002788 -0.001599 -0.001552 -0.002238 0.004229 0.003912 -0.001180 0.004215 0.004820 0.001815 0.004983 -0.003111 -0.001532 -0.002107 -0.002907 0.002815 0.001579 0.000425 -0.002194 0.001524 0.003059 0.000194"
    at sun.misc.FloatingDecimal.readJavaFormatString(FloatingDecimal.java:2043)
    at sun.misc.FloatingDecimal.parseDouble(FloatingDecimal.java:110)
    at java.lang.Double.parseDouble(Double.java:538)
    at CRP_multi_sense.readvect(CRP_multi_sense.java:560)
    at CRP_multi_sense.main(CRP_multi_sense.java:66)

makrai • Dec 11 '17 08:12

@makrai Hi, may I ask if you ended up figuring out this issue?

b05102139 • Aug 01 '23 01:08
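
For anyone landing here with the same trace: the exception message shows that `Double.parseDouble` received an entire row of space-separated weights as one string, and that call only accepts a single number. The thread does not show the source of `readvect`, so what delimiter or layout it expects is unknown. The sketch below only reproduces the failure mode and shows one way to parse such a row. The class `EmbeddingRowParsing` and its methods are hypothetical names made up for illustration and are not part of the repository.

```java
import java.util.Arrays;

public class EmbeddingRowParsing {

    // Hypothetical helper (not from the repository): parse one row of
    // whitespace-separated weights into a double[].
    public static double[] parseRow(String line) {
        String trimmed = line.trim();
        if (trimmed.isEmpty()) {
            return new double[0];
        }
        // Split on any run of whitespace (spaces or tabs).
        String[] tokens = trimmed.split("\\s+");
        double[] values = new double[tokens.length];
        for (int i = 0; i < tokens.length; i++) {
            values[i] = Double.parseDouble(tokens[i]);
        }
        return values;
    }

    // Returns true if passing the string to Double.parseDouble without
    // splitting it first throws NumberFormatException, as in the trace.
    public static boolean failsUnsplit(String line) {
        try {
            Double.parseDouble(line);
            return false;
        } catch (NumberFormatException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // First three weights copied from the exception message in the issue.
        String row = "0.004003 0.004419 -0.003830";
        System.out.println("fails when unsplit: " + failsUnsplit(row));
        System.out.println("parsed after split: " + Arrays.toString(parseRow(row)));
    }
}
```

If the reader in `CRP_multi_sense.java` splits each line on a delimiter that never occurs in your file (for example a tab when the file uses spaces), the whole row would reach `Double.parseDouble` unsplit and fail in exactly this way. That is a guess based on the trace alone; checking the code around `CRP_multi_sense.java:560` would confirm which delimiter it actually expects.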