GLN
Segmentation fault
Hi, excuse me, I have run into a new issue: when I train the model, I get a segmentation fault (core dumped). Could you update the code? I have no idea how to solve the problem.
One more thing: I think GLN/gln/mods/mol_gnn/gnn_family/utils.py could be updated by replacing cuda() with to(DEVICE). Thanks a lot.
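For readers unfamiliar with the cuda() versus to(DEVICE) suggestion, here is a minimal sketch of the general PyTorch pattern. It is not the actual code in utils.py: `resolve_device` and `move_to_device` are hypothetical helper names, and the convention that a negative index means CPU is an assumption made for illustration.

```python
def resolve_device(gpu_id: int):
    """Return a torch.device for gpu_id, falling back to the CPU.

    Hypothetical helper: a negative gpu_id, or a machine without CUDA,
    yields the CPU device instead of raising.
    """
    import torch  # imported lazily so this file still imports without PyTorch

    if gpu_id >= 0 and torch.cuda.is_available():
        return torch.device(f"cuda:{gpu_id}")
    return torch.device("cpu")


def move_to_device(tensor, device):
    """Device-agnostic replacement for a hard-coded tensor.cuda() call."""
    return tensor.to(device)


if __name__ == "__main__":
    import torch

    device = resolve_device(0)
    x = move_to_device(torch.zeros(3), device)
    print(x.device)
```

The point of the pattern is that the device is chosen once and passed around, so the same code runs on a CPU-only machine without edits.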
Could you please provide more details about the segfault?
./run_mf.sh: line 60: 9301 Segmentation fault (core dumped) python ../main.py -gm $gm -fp_degree 2 -neg_sample $neg_sample -att_type $att_type -gnn_out $gnn_out -tpl_enc $tpl_enc -subg_enc $subg_enc -latent_dim $msg_dim -bn $bn -gen_method $gen -retro_during_train $retro -neg_num $neg_size -embed_dim $embed_dim -readout_agg_type $graph_agg -act_func $act -act_last True -max_lv $lv -dropbox $dropbox -data_name $data_name -save_dir $save_dir -tpl_name $tpl_name -f_atoms $dropbox/cooked_$data_name/atom_list.txt -iters_per_val 3000 -gpu 1 -topk 50 -beam_size 50 -num_parts 1
There is no other information. I don't think it is an environment issue.
Are you able to run the test with the existing model dumps?
And did you modify the script?
I use -gpu 0 in the script. Please try with the vanilla code and see if that works.
I got another issue, a GPU CUDA error. Were the ckpt files saved on a GPU?
I use -gpu 1. Did you save the model on GPU 0? Running the test script fails with the following error:
Traceback (most recent call last):
File "main_test.py", line 139, in
Yes, it uses the GPU by default. Please always use -gpu 0 in your script. If you want to change GPU, please use CUDA_VISIBLE_DEVICES instead.
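To illustrate the CUDA_VISIBLE_DEVICES advice: when the variable is set to 1, the process only sees physical GPU 1 and addresses it as device 0, so the script can keep -gpu 0. Below is a minimal sketch that launches a command with the variable set. `run_on_gpu` is a hypothetical helper, not part of GLN, and the child command is a placeholder that only echoes the variable.

```python
import os
import subprocess
import sys
from typing import Sequence


def run_on_gpu(physical_gpu: int, cmd: Sequence[str]) -> subprocess.CompletedProcess:
    """Run cmd with only the given physical GPU visible to CUDA.

    Inside the child process the selected card is addressed as device 0.
    """
    env = os.environ.copy()
    env["CUDA_VISIBLE_DEVICES"] = str(physical_gpu)
    return subprocess.run(list(cmd), env=env, capture_output=True, text=True, check=False)


if __name__ == "__main__":
    # Placeholder child command: print the variable as the child sees it.
    child = [sys.executable, "-c", "import os; print(os.environ['CUDA_VISIBLE_DEVICES'])"]
    result = run_on_gpu(1, child)
    print(result.stdout.strip())
```

In a shell, the equivalent is to prefix the launch command with `CUDA_VISIBLE_DEVICES=1` and leave `-gpu 0` unchanged.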
Hi, I debugged the code. The error occurs at GLN/gln/graph_logic/soft_logic.py line 29, in jagged_forward: graph_embed = graph_enc(list). There is no other information. Could you briefly introduce your code? I cannot find the error. Thanks.
Could you provide a Docker image? I think it would be useful.
graph_enc is from another sub package in this repo.
Can you first try without a GPU? Please take a look at https://discuss.pytorch.org/t/on-a-cpu-device-how-to-load-checkpoint-saved-on-gpu-device/349 to see how to load a GPU dump onto the CPU.
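The approach described in that forum thread comes down to passing `map_location` to `torch.load`. A minimal sketch, assuming the checkpoint is an ordinary file written by `torch.save`; `load_checkpoint_on_cpu` and the file name `demo.ckpt` are hypothetical and not taken from GLN.

```python
def load_checkpoint_on_cpu(path: str):
    """Load a checkpoint so that all of its tensors end up on the CPU.

    map_location remaps storages that were saved from a CUDA device,
    so loading works even on a machine with no GPU.
    """
    import torch  # imported lazily so this file still imports without PyTorch

    return torch.load(path, map_location=torch.device("cpu"))


if __name__ == "__main__":
    import os
    import tempfile

    import torch

    # Round trip with a tiny stand-in checkpoint (hypothetical file name).
    with tempfile.TemporaryDirectory() as tmp:
        ckpt_path = os.path.join(tmp, "demo.ckpt")
        torch.save({"weight": torch.ones(2)}, ckpt_path)
        state = load_checkpoint_on_cpu(ckpt_path)
        print(state["weight"].device)
```

The same argument also accepts a string such as `"cuda:0"`, which is one way to load a dump saved from one GPU index onto another.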
Hi, I debugged the training file and the test file and got the same error, which is not a CUDA error. Could you briefly introduce your code? Thanks.
If the error is happening in that line, you may want to double-check https://github.com/Hanjun-Dai/GLN/blob/master/gln/mods/mol_gnn/gnn_family/utils.py#L64
Note that different graph NN implementations will override this function.