dgl-lifesci [Roadmap] Release Plan for 0.3

This post lists the development plan for the next release. Feel free to leave a comment if you have any requests.

Support average precision metric
Pre-trained models on benchmarks like MoleculeNet, Alchemy, QM9, etc
Better support for attention visualization
Visualization for learned molecular representations
Adjust learning rate and add gradient clipping for ogbl-ppa.
Add better support for feature selection

Jun 12 '20 07:06 mufeili

If the xxx.txt.proc file does not correspond to the xxx.txt file, xxx.txt.proc should be generated again.

Jun 24 '20 09:06 autodataming

The file 2.rxns contains:

[O:1]=[C:2]([OH:3])[c:4]1[c:5]([Br:6])[cH:7][cH:8][cH:9][c:10]1[NH:11][C:12](=[O:13])[CH3:14]>>[O:1]=[C:2]([OH:3])[c:4]1[c:5]([Br:6])[cH:7][cH:8][cH:9][c:10]1[NH2:11]

Running the command

python find_reaction_center_eval.py --test-path 2.rxns -np 1

reports this error:


dgl._ffi.base.DGLError: Expect number of features to match number of nodes (len(u)). Got 27 and 14 instead.

Jun 24 '20 09:06 autodataming

If the xxx.txt.proc file does not correspond to the xxx.txt file, xxx.txt.proc should be generated again.

To guarantee that, we would always need to compute the graph edits from scratch. So let's always generate the x.proc file from scratch. I've done that in PR #32.
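
For readers who would rather keep a cache than regenerate every time, one alternative is to store a content hash of the source file next to the processed file and rebuild only on a mismatch. This is a minimal sketch and not what PR #32 does; the helper names `proc_is_stale` and `mark_proc_fresh` and the `.proc.sha256` sidecar file are hypothetical.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def proc_is_stale(rxn_path):
    """Return True if <rxn_path>.proc is missing or was built from other content.

    Hypothetical convention: the digest of the source file is written to
    <rxn_path>.proc.sha256 whenever the .proc file is regenerated.
    """
    proc = Path(str(rxn_path) + ".proc")
    stamp = Path(str(rxn_path) + ".proc.sha256")
    if not proc.exists() or not stamp.exists():
        return True
    return stamp.read_text().strip() != file_digest(rxn_path)


def mark_proc_fresh(rxn_path):
    """Record the current digest of the source file after regenerating .proc."""
    Path(str(rxn_path) + ".proc.sha256").write_text(file_digest(rxn_path))
```

Always regenerating is simpler and cannot go stale; a hash check only pays off when preprocessing is slow.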

Jun 25 '20 05:06 mufeili

I guess 2.rxns previously held different reactions, and the script loaded the DGLGraphs that had been constructed for those reactions. I'm now changing the default behavior to construct DGLGraphs from scratch in PR #32.

Jun 25 '20 05:06 mufeili

DGLGraphs file "test.bin"
rxn file "xxx.txt"
rxn process file "xxx.txt.proc"

It would be better if the base name of the DGLGraphs file were consistent with the rxn file:

test.bin -> xxx.txt.bin
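
The naming scheme requested above can be sketched as deriving every cache path from the rxn file path. This is only an illustration of the convention; the function name `derived_paths` is hypothetical and is not part of dgl-lifesci.

```python
from pathlib import Path


def derived_paths(rxn_path):
    """Build cache file paths that share the base name of the rxn file."""
    rxn = Path(rxn_path)
    return {
        # processed reactions, e.g. xxx.txt -> xxx.txt.proc
        "proc": rxn.with_name(rxn.name + ".proc"),
        # serialized DGLGraphs, e.g. xxx.txt -> xxx.txt.bin
        "graphs": rxn.with_name(rxn.name + ".bin"),
    }
```

With this convention, two different rxn files in the same directory can no longer share one graph cache by accident.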

Jun 28 '20 01:06 autodataming

This shall be addressed in PR #35.

Jun 28 '20 06:06 mufeili

Please add a debug mode!

In debug mode, the script would report which rxn raises the error.

For example, running the command

python find_reaction_center_eval.py --test-path sin_map_clean.rxns   -np 1

Evaluation on the test set.
Traceback (most recent call last):
  File "find_reaction_center_eval.py", line 79, in <module>
    main(args)
  File "find_reaction_center_eval.py", line 47, in main
    args, args['top_ks_test'], model, test_loader, args['easy'])
  File "/home/NFS/user/zgong/czq/workflow_retro_deepsyn2/step3dgllifesci/dgl-lifesci/examples/reaction_prediction/rexgen_direct/utils.py", line 456, in reaction_center_final_eval
    for batch_id, batch_data in enumerate(data_loader):
  File "/home/zgong/nfs/program/anaconda2/envs/py36dgllifesci/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 345, in __next__
    data = self._next_data()
  File "/home/zgong/nfs/program/anaconda2/envs/py36dgllifesci/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 385, in _next_data
    data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
  File "/home/zgong/nfs/program/anaconda2/envs/py36dgllifesci/lib/python3.6/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/zgong/nfs/program/anaconda2/envs/py36dgllifesci/lib/python3.6/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/zgong/nfs/program/anaconda2/envs/py36dgllifesci/lib/python3.6/site-packages/dgllife/data/uspto.py", line 509, in __getitem__
    self.atom_pair_labels[item] = get_pair_label(mol, self.graph_edits[item])
  File "/home/zgong/nfs/program/anaconda2/envs/py36dgllifesci/lib/python3.6/site-packages/dgllife/data/uspto.py", line 181, in get_pair_label
    labels[i, j, pair_to_changes[(j, i)]] = 1.
IndexError: index 62 is out of bounds for dimension 1 with size 62

With only the first 100 rxns of sin_map_clean.rxns, no error is reported:

head -n 100 sin_map_clean.rxns > sin100.rxns
python find_reaction_center_eval.py --test-path sin100.rxns    -np 1
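
Until such a debug mode exists, the failing reaction can be located by running the per-reaction step on each line separately and recording which lines raise. This is a minimal sketch; `find_failing_reactions` is a hypothetical helper, and `process_fn` is a placeholder for whatever per-reaction processing is suspected (for example, label construction), not a dgl-lifesci API.

```python
def find_failing_reactions(rxn_path, process_fn):
    """Run process_fn on every non-empty line and collect the ones that raise.

    Returns a list of (line_number, reaction_string, error_repr) tuples.
    """
    failures = []
    with open(rxn_path) as f:
        for line_no, line in enumerate(f, start=1):
            rxn = line.strip()
            if not rxn:
                continue
            try:
                process_fn(rxn)
            except Exception as exc:
                # Keep going so that all problematic reactions are reported.
                failures.append((line_no, rxn, repr(exc)))
    return failures
```

This is more direct than bisecting the file with head, since it reports every offending line in one pass.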

Jun 28 '20 08:06 autodataming

Can you provide a reaction that will yield the error? I want to use that for developing the feature you requested.

Jun 28 '20 13:06 mufeili

This shall be addressed in PR #38.

Jun 30 '20 18:06 mufeili

Just tried and I think the issue no longer exists with the master branch.

On Tue, Aug 25, 2020 at 12:03 PM summer-cola wrote:

add debug mode!

Running the command

python classification_train.py -c XXX.csv -sc SMILES -t XXX -mo MPNN

gives the following problem:

Traceback (most recent call last):
  File "classification_train.py", line 218, in <module>
    main(args, exp_config, train_set, val_set, test_set)
  File "classification_train.py", line 93, in main
    run_a_train_epoch(args, epoch, model, train_loader, loss_criterion, optimizer)
  File "classification_train.py", line 33, in run_a_train_epoch
    logits = predict(args, model, bg)
  File "/home/yuanyuan/dgl-lifesci/examples/property_prediction/csv_data_configuration/utils.py", line 329, in predict
    edge_feats = bg.edata.pop('e').to(args['device'])
  File "/home/yuanyuan/soft/anaconda3/lib/python3.7/_collections_abc.py", line 795, in pop
    value = self[key]
  File "/home/yuanyuan/soft/anaconda3/lib/python3.7/site-packages/dgl/view.py", line 128, in __getitem__
    return self._graph.get_e_repr(self._edges)[key]
KeyError: 'e'

When predicting molecular properties with -mo weave/attentivefp/MPNN, the problem also exists.


Aug 25 '20 08:08 mufeili

https://github.com/awslabs/dgl-lifesci/issues/18#issuecomment-679882211 Yes, it is working. Thanks.

Aug 25 '20 09:08 summer-cola
