MatchZoo
Loading word2vec embedding exceeds the memory limit
Describe the bug
Loading a word2vec embedding file runs out of memory. Loading the embedding vectors in string format requires much more memory.
Solution
Modify the function matchzoo.embedding.load_from_file
from:
data = pd.read_csv(file_path, sep=" ", index_col=0, header=None, skiprows=1)
to (this needs `import csv` if the module does not already import it):
data = pd.read_csv(file_path, sep=" ", index_col=0, header=None, skiprows=1, quoting=csv.QUOTE_NONE)
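For illustration, here is a minimal sketch of what the changed call does on a tiny, made-up file in word2vec text format. The sample data and the helper name `load_vectors` are assumptions for this example, not MatchZoo code. It assumes the vocabulary can contain a bare quote character as a token; with `quoting=csv.QUOTE_NONE`, pandas treats that character as ordinary text, so each line stays one row and the vector columns parse as floats rather than strings.

```python
import csv
import io

import pandas as pd

# Made-up sample: a "count dim" header line, then one token per line
# followed by its vector components. The second token is a quote character.
SAMPLE = '3 2\nhello 0.1 0.2\n" 0.3 0.4\nworld 0.5 0.6\n'


def load_vectors(source):
    """Read word2vec-style text, treating quote characters as plain text.

    Hypothetical helper mirroring the read_csv call proposed in this issue.
    """
    return pd.read_csv(
        source,
        sep=" ",
        index_col=0,       # first field on each line is the token
        header=None,
        skiprows=1,        # skip the "count dim" header line
        quoting=csv.QUOTE_NONE,
    )


data = load_vectors(io.StringIO(SAMPLE))
print(data.shape)
print(data.dtypes.tolist())
```

Checking `data.dtypes` on a real embedding file is a quick way to confirm that the columns were parsed as floats and not as Python strings.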
Would you like to send a PR to fix this issue? @danielwonght
Sure.