
KGE cannot work with python 3.8 + pytorch 1.7

Open sbonner0 opened this issue 4 years ago • 6 comments

Hi,

I am trying to run the example from the readme as follows:

DGLBACKEND=pytorch dglke_train --model_name TransE_l2 --dataset FB15k --batch_size 1000 \
  --neg_sample_size 200 --hidden_dim 400 --gamma 19.9 --lr 0.25 --max_step 500 --log_interval 100 \
  --batch_size_eval 16 -adv --regularization_coef 1.00E-09 --test --num_thread 1 --num_proc 8

I have the following versions of packages:

  • python 3.8.6
  • torch 1.7.0
  • dgl 0.4.3
  • dglke 0.1.1

However, the command crashes for me with the following error:

  File "/Users/redacted/Venvs/kg/bin/dglke_train", line 8, in <module>
    sys.exit(main())
  File "/Users/redacted/Venvs/kg/lib/python3.8/site-packages/dglke/train.py", line 271, in main
    proc.start()
  File "/usr/local/opt/python@3.8/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/process.py", line 121, in start
    self._popen = self._Popen(self)
  File "/usr/local/opt/python@3.8/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/context.py", line 224, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "/usr/local/opt/python@3.8/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/context.py", line 284, in _Popen
    return Popen(process_obj)
  File "/usr/local/opt/python@3.8/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 32, in __init__
    super().__init__(process_obj)
  File "/usr/local/opt/python@3.8/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/popen_fork.py", line 19, in __init__
    self._launch(process_obj)
  File "/usr/local/opt/python@3.8/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 47, in _launch
    reduction.dump(process_obj, fp)
  File "/usr/local/opt/python@3.8/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
AttributeError: Can't pickle local object 'TransEScore.create_neg.<locals>.fn'
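
The last frames show the process object being pickled (`reduction.dump`) when the worker is started, and the failure is in that pickling step. The same class of error can be reproduced without dglke. The sketch below is only an illustration: `create_neg_sketch` and `try_pickle` are hypothetical names, not dglke code.

```python
import pickle


def create_neg_sketch():
    # Hypothetical stand-in for a method that defines and returns a local function.
    def fn(x):
        return x + 1

    return fn


def try_pickle(obj):
    """Return None if obj can be pickled, otherwise the error message."""
    try:
        pickle.dumps(obj)
        return None
    except (AttributeError, pickle.PicklingError) as exc:
        return str(exc)


# A function defined inside another function cannot be looked up by its
# qualified name, so pickle refuses to serialize it.
error_message = try_pickle(create_neg_sketch())
print(error_message)
```

Calling the returned function directly still works; only serializing it for another process fails.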

sbonner0 avatar Dec 02 '20 14:12 sbonner0

Can you try downgrading your versions of Python and PyTorch? In our testing environment, we use Python 3.6 and PyTorch 1.6.

classicsong avatar Dec 02 '20 15:12 classicsong

Thanks - I was actually able to get this to work by using the versions you mentioned! Might it be worth noting somewhere in the readme that these versions are required?
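
A possible explanation, offered as a guess rather than a confirmed diagnosis: the traceback goes through `popen_spawn_posix.py`, so the `spawn` start method is in use, and Python 3.8 made `spawn` the default on macOS (earlier versions defaulted to `fork`, which does not pickle the process object). A small sketch for checking which start method an environment uses; `describe_environment` is a hypothetical helper name.

```python
import multiprocessing as mp
import platform
import sys


def describe_environment():
    """Collect the values that matter for the pickling error."""
    return {
        "python": platform.python_version(),
        "platform": sys.platform,
        # "spawn" pickles the Process object; "fork" does not.
        "start_method": mp.get_start_method(),
    }


info = describe_environment()
print(info)
```

If this reports `spawn`, anything passed to a worker process has to be picklable.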

sbonner0 avatar Dec 02 '20 16:12 sbonner0

Yes, we need to make kge work with pytorch 1.7

classicsong avatar Dec 03 '20 01:12 classicsong

Hey, I am running this in my virtual environment with the code below, and the same issue still occurs. Can anyone help with this?

import subprocess

# Run dglke_train as a child process and capture its standard output.
subprocess.run(
    [
        "dglke_train",
        "--model_name", "TransE_l2",
        "--batch_size", "1000",
        "--neg_sample_size", "1",
        "--hidden_dim", "4",
        "--gamma", "19.9",
        "--lr", "0.25",
        "--max_step", "30",
        "--log_interval", "10",
        "--batch_size_eval", "16",
        "-adv",
        "--regularization_coef", "1.00E-09",
        "--save_path", "./data",
        "--data_path", "./dataset/",
        "--format", "raw_udd_hrt",
        "--data_files", "train.tsv",
        "--dataset", "xxx",
        "--neg_sample_size_eval", "10000",
        "--num_thread", "1",
        "--num_proc", "4",
    ],
    stdout=subprocess.PIPE,
)

------ ERROR---

Traceback (most recent call last):
  File "C:\Users\umitozm\AppData\Local\Programs\Python\Python36\lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "C:\Users\umitozm\AppData\Local\Programs\Python\Python36\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "c:\Users\umitozm\Recipe-Network\venv\Scripts\dglke_train.exe\__main__.py", line 7, in
    proc.start()
  File "C:\Users\umitozm\AppData\Local\Programs\Python\Python36\lib\multiprocessing\process.py", line 105, in start
    self._popen = self._Popen(self)
  File "C:\Users\umitozm\AppData\Local\Programs\Python\Python36\lib\multiprocessing\context.py", line 223, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "C:\Users\umitozm\AppData\Local\Programs\Python\Python36\lib\multiprocessing\context.py", line 322, in _Popen
    return Popen(process_obj)
  File "C:\Users\umitozm\AppData\Local\Programs\Python\Python36\lib\multiprocessing\popen_spawn_win32.py", line 65, in __init__
    reduction.dump(process_obj, to_child)
  File "C:\Users\umitozm\AppData\Local\Programs\Python\Python36\lib\multiprocessing\reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
AttributeError: Can't pickle local object 'TransEScore.create_neg.<locals>.fn'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Users\umitozm\AppData\Local\Programs\Python\Python36\lib\multiprocessing\spawn.py", line 105, in spawn_main
    exitcode = _main(fd)
  File "C:\Users\umitozm\AppData\Local\Programs\Python\Python36\lib\multiprocessing\spawn.py", line 115, in _main
    self = reduction.pickle.load(from_parent)
EOFError: Ran out of input

(base) (venv) PS C:\Users\umitozm\Recipe-Network> pip list
Package      Version
certifi      2021.5.30
chardet      4.0.0
decorator    4.4.2
dgl          0.4.3
dglke        0.1.1
future       0.18.2
idna         2.10
networkx     2.5.1
numpy        1.19.5
Pillow       8.2.0
pip          21.1.3
requests     2.25.1
scipy        1.5.4
setuptools   57.0.0
torch        1.6.0+cpu
torchvision  0.7.0+cpu
urllib3      1.26.6
wheel        0.36.2

umitozmen avatar Jun 30 '21 10:06 umitozmen

For the above error, I changed "--num_proc", "4" to "--num_proc", "1" after searching online for the _winapi issue. Hopefully this helps someone else as well.
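
My guess (not verified against the dglke source) is that with a single process no worker has to be started, so nothing gets pickled. If multiple processes are needed, the general fix for this class of error is to replace the local function with a module-level one and bind its arguments with `functools.partial`. A minimal sketch of that pattern; `_neg_score` and `create_neg_picklable` are hypothetical names and this is not the dglke implementation.

```python
import pickle
from functools import partial


def _neg_score(x, offset):
    # Module-level function: pickle can find it again by its qualified name.
    return x + offset


def create_neg_picklable(offset):
    # Returns a picklable callable instead of a closure defined in this scope.
    return partial(_neg_score, offset=offset)


fn = create_neg_picklable(1)

# Round-trip through pickle, which is what a spawned worker process requires.
restored = pickle.loads(pickle.dumps(fn))
result = restored(2)
print(result)
```

The restored callable behaves like the original, so it can be handed to a worker started with the spawn method.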

umitozmen avatar Jun 30 '21 11:06 umitozmen

> for above error, I changed "--num_proc", "4" to "--num_proc", "1" looking at internet for _winapi issue, hopefully, it would help someone else as well.

This just solved the problem for me as well. I wonder if this should be a separate issue?

aridf avatar Nov 01 '21 14:11 aridf