dgl-ke
KGE cannot work with python 3.8 + pytorch 1.7
Hi,
I am trying to run the example from the readme as follows:
DGLBACKEND=pytorch dglke_train --model_name TransE_l2 --dataset FB15k --batch_size 1000 \
--neg_sample_size 200 --hidden_dim 400 --gamma 19.9 --lr 0.25 --max_step 500 --log_interval 100 \
--batch_size_eval 16 -adv --regularization_coef 1.00E-09 --test --num_thread 1 --num_proc 8
I have the following versions of packages:
- python 3.8.6
- torch 1.7.0
- dgl 0.4.3
- dglke 0.1.1
However the command crashes for me with the following error:
  File "/Users/redacted/Venvs/kg/bin/dglke_train", line 8, in <module>
    sys.exit(main())
  File "/Users/redacted/Venvs/kg/lib/python3.8/site-packages/dglke/train.py", line 271, in main
    proc.start()
  File "/usr/local/opt/python@3.8/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/process.py", line 121, in start
    self._popen = self._Popen(self)
  File "/usr/local/opt/python@3.8/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/context.py", line 224, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "/usr/local/opt/python@3.8/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/context.py", line 284, in _Popen
    return Popen(process_obj)
  File "/usr/local/opt/python@3.8/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 32, in __init__
    super().__init__(process_obj)
  File "/usr/local/opt/python@3.8/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/popen_fork.py", line 19, in __init__
    self._launch(process_obj)
  File "/usr/local/opt/python@3.8/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 47, in _launch
    reduction.dump(process_obj, fp)
  File "/usr/local/opt/python@3.8/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
AttributeError: Can't pickle local object 'TransEScore.create_neg.<locals>.fn'
Can you try downgrading the versions of Python and PyTorch? In our testing environment, we use Python 3.6 and PyTorch 1.6.
Thanks - I was actually able to get this to work by using the versions you mentioned! Might it be worth noting somewhere in the readme that these versions are required?
Yes, we need to make kge work with pytorch 1.7
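For anyone landing here later: the traceback above passes through popen_spawn_posix.py, so the child process is being started with the "spawn" method, which pickles the process object before launching it. Python 3.8 made "spawn" the default start method on macOS, and functions defined inside another function cannot be pickled. That is likely why the same code runs under Python 3.6. Below is a minimal sketch of that failure mode. The names make_local_fn, scaled and can_pickle are made up for illustration and are not part of dglke.

```python
import pickle
from functools import partial


def make_local_fn(scale):
    # Hypothetical stand-in for a factory like TransEScore.create_neg,
    # which returns a function defined inside its own body.
    def fn(x):
        return x * scale
    return fn


def scaled(x, scale):
    # Module-level equivalent of the local function above.
    return x * scale


def can_pickle(obj):
    # Return True if obj survives pickle.dumps, False otherwise.
    try:
        pickle.dumps(obj)
        return True
    except (AttributeError, pickle.PicklingError):
        return False


# The local function is the kind of object "spawn" cannot send to a child.
print("local function picklable:", can_pickle(make_local_fn(2)))
# A partial over a module-level function is the usual picklable replacement.
print("partial picklable:", can_pickle(partial(scaled, scale=2)))
```

A common way around this, assuming the code can be changed, is to replace the nested function with a module-level function (optionally wrapped in functools.partial) so that it can be pickled by reference.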
Hey, I am running this in my virtual environment with the code below, and the same issue still occurs. Can anyone help with this?
import subprocess

subprocess.run(["dglke_train", "--model_name", "TransE_l2",
                "--batch_size", "1000",
                "--neg_sample_size", "1",
                "--hidden_dim", "4",
                "--gamma", "19.9",
                "--lr", "0.25",
                "--max_step", "30",
                "--log_interval", "10",
                "--batch_size_eval", "16",
                "-adv",
                "--regularization_coef", "1.00E-09",
                "--save_path", "./data",
                "--data_path", "./dataset/",
                "--format", "raw_udd_hrt",
                "--data_files", "train.tsv",
                "--dataset", "xxx",
                "--neg_sample_size_eval", "10000",
                "--num_thread", "1",
                "--num_proc", "4"], stdout=subprocess.PIPE)
------ ERROR---
Traceback (most recent call last):
  File "C:\Users\umitozm\AppData\Local\Programs\Python\Python36\lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "C:\Users\umitozm\AppData\Local\Programs\Python\Python36\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "c:\Users\umitozm\Recipe-Network\venv\Scripts\dglke_train.exe\__main__.py", line 7, in <module>
certifi 2021.5.30
chardet 4.0.0
decorator 4.4.2
dgl 0.4.3
dglke 0.1.1
future 0.18.2
idna 2.10
networkx 2.5.1
numpy 1.19.5
Pillow 8.2.0
pip 21.1.3
requests 2.25.1
scipy 1.5.4
setuptools 57.0.0
torch 1.6.0+cpu
torchvision 0.7.0+cpu
urllib3 1.26.6
wheel 0.36.2
To fix the above error, I changed "--num_proc", "4" to "--num_proc", "1" after searching online for the _winapi issue. Hopefully this helps someone else as well.
This just solved the problem for me as well. I wonder if this should be a separate issue?
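In case it helps, here is a rough sketch of the workaround reported above: build the same dglke_train argument list, but fall back to a single process on Windows. The helper name build_dglke_args and its platform parameter are made up for illustration. The flags are the ones already used in this thread, and whether one process is acceptable for your training run is an assumption you should check yourself.

```python
import sys


def build_dglke_args(num_proc=4, platform=sys.platform):
    # Hypothetical helper: returns the dglke_train command as a list.
    # Assumption: on Windows the multi-process path fails (as reported
    # in this thread), so we force a single process there.
    if platform == "win32":
        num_proc = 1
    return [
        "dglke_train", "--model_name", "TransE_l2",
        "--batch_size", "1000",
        "--neg_sample_size", "1",
        "--hidden_dim", "4",
        "--gamma", "19.9",
        "--lr", "0.25",
        "--max_step", "30",
        "--log_interval", "10",
        "--batch_size_eval", "16",
        "-adv",
        "--regularization_coef", "1.00E-09",
        "--save_path", "./data",
        "--data_path", "./dataset/",
        "--format", "raw_udd_hrt",
        "--data_files", "train.tsv",
        "--dataset", "xxx",
        "--neg_sample_size_eval", "10000",
        "--num_thread", "1",
        "--num_proc", str(num_proc),
    ]


# Print the command for the current platform instead of running it here.
print(" ".join(build_dglke_args()))
```

The returned list can be passed straight to subprocess.run, for example subprocess.run(build_dglke_args(), stdout=subprocess.PIPE), in the same way as the snippet posted earlier.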