CatE
Malloc error
Hello,
I tried to run CatE on a ~500,000-word corpus with the 300D GloVe embeddings and kept receiving this error:
- On Mac: cate(16090,0x700002f98000) malloc: Incorrect checksum for freed object 0x7fc442608c98: probably modified after being freed.
- On Linux: Error in `./src/cate': free(): invalid next size (fast): 0x000055d279a778f0
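For context, both messages point at heap corruption: something writes past the end of a malloc'd block, and the allocator only notices later, when the block is freed. Here is a tiny standalone illustration of that failure class (not CatE's code, just the pattern these diagnostics usually suggest):

```c
#include <stdlib.h>

int main(void) {
    int dim = 300;                            /* embedding dimension */
    /* Bug on purpose: room for 100 floats, but 300 get written,
     * e.g. a buffer sized for one dimension and filled for another. */
    float *vec = malloc(100 * sizeof(float));
    if (vec == NULL)
        return 1;
    for (int i = 0; i < dim; i++)
        vec[i] = 0.0f;                        /* overruns the block */
    /* free() typically aborts here: "invalid next size" on glibc,
     * "Incorrect checksum for freed object" on macOS. */
    free(vec);
    return 0;
}
```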
Do you have any suggestions?
Thanks for the awesome work, by the way. I really enjoy your paper.
Hi,
Thanks for your interest in our paper and code! Could you please provide the following information to help me figure out the issue?
- Were you able to successfully run the code on the example dataset (nyt & yelp)?
- When using the 300D GloVe embedding, did you use that as the pre-trained embedding to load from?
- Did the error happen before the training started, or during training?
Thanks, Yu
Hello,
Thanks for the quick reply.
- I ran the code successfully on the example dataset with word2vec_100.txt, and it also ran successfully on my own dataset with word2vec_100.txt. I haven't tried the 300D GloVe embeddings on the example dataset yet. Strangely, if I retry maybe more than 10 times, it occasionally runs successfully with the 300D embeddings on my dataset.
- I use -load-emb glove_emb_file.txt -size 300 to load GloVe as the pre-trained embedding.
- The error happens most often during the embedding-loading stage. There are also occasions where it happens during the pre-training epochs (<15% progress).
Thanks for helping out.
Hi,
Thanks for the info. I tried downloading the GloVe pre-trained embeddings and adding the vocabulary size & embedding dimension to the first line of the embedding file (I assume you did this too). I then hit a segfault as well, though it was different from the memory error you mentioned. I might need some more time to figure out what happened, since I'm quite busy at the moment. In the meantime, maybe you can try running the code without loading pre-trained embeddings; they are not required, and I found that the code works fine without them. Alternatively, you could try loading another type of pre-trained embedding, such as JoSE, which also provides 300D pre-trained embeddings. Different types of pre-trained embeddings won't make much difference in CatE's final results as long as they provide good initializations.
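For reference, the header I added follows the word2vec text format: a first line like 400000 300 (vocabulary size, then dimension), with one word and its 300 values on each following line. Below is a rough sketch of the kind of header/dimension check I have in mind; it is not CatE's actual loading code, and the file name and variables are just illustrative:

```c
#include <stdio.h>

int main(void) {
    FILE *fp = fopen("glove_emb_file.txt", "r");  /* placeholder path */
    if (fp == NULL) { perror("fopen"); return 1; }

    long vocab_size, file_dim;
    int size = 300;  /* what -size 300 tells the program to allocate */

    /* word2vec-style header: "<vocab_size> <dim>" on the first line */
    if (fscanf(fp, "%ld %ld", &vocab_size, &file_dim) != 2) {
        fprintf(stderr, "embedding file has no '<vocab> <dim>' header\n");
        fclose(fp);
        return 1;
    }
    /* If the file's dimension disagrees with -size, copying file_dim
     * floats into a size-element buffer would overrun it, which would
     * match a crash during the loading stage. */
    if (file_dim != (long)size) {
        fprintf(stderr, "dimension mismatch: file says %ld, -size is %d\n",
                file_dim, size);
        fclose(fp);
        return 1;
    }
    /* ... proceed to read vocab_size rows of size floats each ... */
    fclose(fp);
    return 0;
}
```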
I'll let you know once I figure out the issue.
Best, Yu