
core dumped, tcmalloc: large alloc

Open zhouliang-yu opened this issue 2 years ago • 1 comment

Hey @wouterkool, thank you so much for sharing the code. I ran into an issue that prevents me from running the evaluation successfully.

I installed the required packages as follows:

!conda install tqdm -y
!pip install gdown
!conda install pytorch torchvision torchaudio cudatoolkit=10.2 -c pytorch -y
!pip install tensorboardx==1.5 fastprogress==0.1.18
!pip install cupy-cuda102
!pip install torch-scatter -f https://pytorch-geometric.com/whl/torch-1.10.0+cu102.html
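One thing that may be worth checking here: the torch-scatter wheel index above targets torch 1.10.0 with CUDA 10.2, while the conda command does not pin a PyTorch version, so the two could end up mismatched. Below is a minimal sketch of such a check. The helper names `parse_wheel_index` and `build_matches` are hypothetical (not part of this repository), and whether a mismatch is actually behind the error is only an assumption.

```python
import re

# Wheel index URL copied from the pip command above.
WHEEL_URL = "https://pytorch-geometric.com/whl/torch-1.10.0+cu102.html"


def parse_wheel_index(url):
    """Extract (torch_version, cuda_tag) from a wheel index URL of the form above."""
    m = re.search(r"torch-(\d+\.\d+\.\d+)\+(\w+)\.html$", url)
    if m is None:
        raise ValueError(f"unrecognized wheel index URL: {url}")
    return m.group(1), m.group(2)


def build_matches(torch_version, cuda_version, url=WHEEL_URL):
    """Hypothetical helper: compare an installed torch build with the wheel index."""
    want_torch, want_cuda = parse_wheel_index(url)
    # torch.__version__ may carry a local suffix such as "+cu102"; drop it.
    have_torch = torch_version.split("+")[0]
    # torch.version.cuda is a string like "10.2", or None for CPU-only builds.
    have_cuda = "cpu" if cuda_version is None else "cu" + cuda_version.replace(".", "")
    return have_torch == want_torch and have_cuda == want_cuda


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("torch is not installed in this environment")
    else:
        ok = build_matches(torch.__version__, torch.version.cuda)
        print(torch.__version__, torch.version.cuda, "match" if ok else "MISMATCH")
```

If this reports a mismatch, pinning PyTorch to the version named in the wheel URL (or picking the wheel index for the installed version) would be the first thing to try.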

Then I tried to generate the heatmap with:

!python export_heatmap.py --problem tsp --checkpoint logs/tsp100/best_val_checkpoint.tar --instances data/tsp/tsp100_validation_seed4321.pkl -f --batch_size 50 --no_prepwrap

However, it fails with the following message:

tcmalloc: large alloc 140329253576704 bytes == (nil) @  0x7fa0f450d887 0x7fa0f33e7269 0x7fa0f33e80d2 0x7fa0f33e88fc 0x7fa0cd80ac78 0x7fa0cd80eb38 0x7fa05f7679aa 0x7fa05f763352 0x7fa0f47588d3 0x7fa0f475d39f 0x7fa0f388a16f 0x7fa0f475c96a 0x7fa0f42d4f96 0x7fa0f388a16f 0x7fa0f388a1ff 0x7fa0f42d5745 0x7fa0f42d5051 0x7fa0f2f89675 0x55c57b34e09d 0x55c57b34313f 0x55c57b3ed89f 0x55c57b3df0ff 0x55c57b3dfbc4 0x55c57b3cb698 0x55c57b342fa8 0x55c57b3eddd4 0x55c57b3df7e7 0x55c57b378b2e 0x55c57b3ed923 0x55c57b3de600 0x55c57b40b7d9

Then I tried to use the pre-generated heatmap and run the evaluation:

!python eval.py data/tsp/tsp100_validation_seed4321.pkl --problem tsp --decode_strategy dpdp --score_function heatmap_potential --beam_size 100000 --heatmap_threshold 1e-5 --heatmap results/tsp/tsp100_validation_seed4321/heatmaps/heatmaps_tsp100.pkl

But I got a similar error:

tcmalloc: large alloc 139920501792768 bytes == (nil) @  0x7f41c8cf8887 0x7f41c7bd2269 0x7f41c7bd30d2 0x7f41c7bd38fc 0x7f41a1f5bc78 0x7f41a1f5fb38 0x7f4125b7e9aa 0x7f4125b7a352 0x7f41c8f438d3 0x7f41c8f4839f 0x7f41c807516f 0x7f41c8f4796a 0x7f41c8abff96 0x7f41c807516f 0x7f41c80751ff 0x7f41c8ac0745 0x7f41c8ac0051 0x7f41c762a675 0x5620417f609d 0x5620417eb13f 0x56204189589f 0x5620418870ff 0x562041887bc4 0x562041873698 0x5620417eafa8 0x562041895dd4 0x5620418877e7 0x562041820b2e 0x562041895923 0x562041886600 0x5620418b37d9

zhouliang-yu · Jun 25 '22 06:06

At first, I thought this might be caused by the dataset being too large, so I tried a much smaller dataset. The issue still remains.
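As a rough sanity check on the dataset-size theory, the two byte counts from the tcmalloc messages can be converted to TiB and to hexadecimal. A minimal sketch follows; the constant and variable names are only illustrative.

```python
# Byte counts copied from the two "tcmalloc: large alloc" messages above.
ALLOC_SIZES = [140329253576704, 139920501792768]

TIB = 2**40  # bytes per tebibyte

for n in ALLOC_SIZES:
    # Show the requested size in TiB and in hex for comparison with the trace.
    print(f"{n} bytes = {n / TIB:.2f} TiB = {hex(n)}")
```

Both requests come out above 127 TiB, far more than any dataset here could need, and in hex both begin with 0x7f, like the addresses in the stack traces. That may suggest the size passed to the allocator is a garbage value rather than genuine memory pressure, though this is only a guess.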

zhouliang-yu · Jun 25 '22 06:06