dgl-lifesci
Problem loading train_set of rexgen_direct example for local training
Hi! I'm trying to train the rexgen model in https://github.com/awslabs/dgl-lifesci/tree/master/examples/reaction_prediction/rexgen_direct but while loading the USPTO data I'm getting a pickle problem as can be seen below:
from dgllife.data import USPTOCenter, WLNCenterDataset
train_set = USPTOCenter('train', num_processes=2)
Preparing train subset of USPTO for reaction center prediction.
Exception in thread Thread-9:
Traceback (most recent call last):
File "/usr/lib/python3.8/threading.py", line 932, in _bootstrap_inner
self.run()
File "/usr/lib/python3.8/threading.py", line 870, in run
self._target(*self._args, **self._kwargs)
File "/usr/lib/python3.8/multiprocessing/pool.py", line 576, in _handle_results
task = get()
File "/usr/lib/python3.8/multiprocessing/connection.py", line 251, in recv
return _ForkingPickler.loads(buf.getbuffer())
RuntimeError: invalid value in pickle
This error doesn't happen when loading the val and test sets, though. Below are my library versions:
rdkit-pypi==2021.9.4
dgl-cu113==0.7.2
dgllife==0.2.9
I've not encountered that before. Can you try using num_processes=1?
It leads to
>>> train_set = USPTOCenter('train', num_processes=1)
Preparing train subset of USPTO for reaction center prediction.
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/marcos/.local/lib/python3.8/site-packages/dgllife/data/uspto.py", line 661, in __init__
super(USPTOCenter, self).__init__(
File "/home/marcos/.local/lib/python3.8/site-packages/dgllife/data/uspto.py", line 461, in __init__
self.load_reaction_data(path_to_reaction_file, num_processes)
File "/home/marcos/.local/lib/python3.8/site-packages/dgllife/data/uspto.py", line 523, in load_reaction_data
mol, reaction, graph_edits = load_one_reaction(li)
File "/home/marcos/.local/lib/python3.8/site-packages/dgllife/data/uspto.py", line 319, in load_one_reaction
reaction, graph_edits = line.strip("\r\n ").split()
but after deleting the downloaded file from ~/.dgl/ and setting num_processes=1, it worked.
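For anyone hitting the same thing, here is a rough sketch of that workaround. It rests on an assumption the thread does not confirm: that the cached reaction file was incomplete or corrupted, so some line no longer splits into the two fields (`reaction`, `graph_edits`) that the last traceback frame unpacks. `find_malformed_lines`, `clear_cached_download` and `load_train_set_single_process` are hypothetical helper names, not part of dgllife, and the exact location of the USPTO files under ~/.dgl/ is not given here, so the paths are left as arguments.

```python
import shutil
from pathlib import Path


def find_malformed_lines(reaction_file):
    """Return 1-based numbers of lines that do not split into exactly two fields."""
    bad = []
    with open(reaction_file) as f:
        for number, line in enumerate(f, start=1):
            # Same strip/split as the last traceback frame in uspto.py.
            if len(line.strip("\r\n ").split()) != 2:
                bad.append(number)
    return bad


def clear_cached_download(cache_dir):
    """Delete cache_dir and everything under it; return True if it existed."""
    cache_dir = Path(cache_dir).expanduser()
    if not cache_dir.exists():
        return False
    shutil.rmtree(cache_dir)
    return True


def load_train_set_single_process():
    """Rebuild the train set with a single process, as in the workaround above."""
    # Imported here so the helpers above can be used without dgllife installed.
    from dgllife.data import USPTOCenter
    return USPTOCenter('train', num_processes=1)
```

If `find_malformed_lines` reports anything for the cached train file, removing the cached download and calling `load_train_set_single_process()` repeats the steps that worked for me.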
I realized that in find_reaction_center_train.py the default value for the number of processes is 4:
parser.add_argument('-np', '--num-processes', type=int, default=4,
help='Number of processes to use for data pre-processing')
so running the script with the default arguments leads to this error.
Thanks. This might be hardware-specific. Perhaps we should change the default value to 1 instead. Could you open a PR to change the default value?
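As a rough illustration of the proposed change (a sketch only, not the actual patch), the argument could be declared with `default=1`. The `build_parser` wrapper is a hypothetical name used here to keep the snippet self-contained; the flag names and help text are the ones quoted from the script above.

```python
import argparse


def build_parser():
    """Build a parser containing only the option discussed in this thread."""
    parser = argparse.ArgumentParser()
    # default=1 is the proposed value; the script currently uses default=4.
    parser.add_argument('-np', '--num-processes', type=int, default=1,
                        help='Number of processes to use for data pre-processing')
    return parser


# With no flag the single-process default applies; -np still overrides it.
default_args = build_parser().parse_args([])
override_args = build_parser().parse_args(['-np', '4'])
```

Users who want parallel pre-processing could still opt in explicitly with `-np`, while the default path would avoid the pickle error reported above.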