[Bug] `lazy=True` argument is not working in case of `USPTO50k` dataset

Open bhadreshpsavani opened this issue 2 years ago • 5 comments

from torchdrug import datasets, data

reaction_dataset = datasets.USPTO50k("~/molecule-datasets/",
                                     lazy=True,
                                     node_feature="center_identification",
                                     kekulize=True, )
synthon_dataset = datasets.USPTO50k("~/molecule-datasets/", as_synthon=True,
                                    node_feature="synthon_completion",
                                    lazy=True,
                                    kekulize=True)

This raises the following error:

TypeError                                 Traceback (most recent call last)
[<ipython-input-6-5448f605ee48>](https://localhost:8080/#) in <module>()
      4                                      lazy=True,
      5                                      node_feature="center_identification",
----> 6                                      kekulize=True, )
      7 synthon_dataset = datasets.USPTO50k("~/molecule-datasets/", as_synthon=True,
      8                                     node_feature="synthon_completion",

<decorator-gen-225> in __init__(self, path, as_synthon, verbose, **kwargs)

4 frames
[/usr/local/lib/python3.7/dist-packages/torchdrug/core/core.py](https://localhost:8080/#) in wrapper(init, self, *args, **kwargs)
    286                 config.pop(k)
    287             self._config = dict(config)
--> 288             return init(self, *args, **kwargs)
    289 
    290         def get_function(method):

[/usr/local/lib/python3.7/dist-packages/torchdrug/datasets/uspto50k.py](https://localhost:8080/#) in __init__(self, path, as_synthon, verbose, **kwargs)
     61 
     62         self.load_csv(file_name, smiles_field="rxn_smiles", target_fields=self.target_fields, verbose=verbose,
---> 63                       **kwargs)
     64 
     65         if as_synthon:

[/usr/local/lib/python3.7/dist-packages/torchdrug/data/dataset.py](https://localhost:8080/#) in load_csv(self, csv_file, smiles_field, target_fields, verbose, **kwargs)
    111                         targets[field].append(value)
    112 
--> 113         self.load_smiles(smiles, targets, verbose=verbose, **kwargs)
    114 
    115     def _standarize_index(self, index, count):

[/usr/local/lib/python3.7/dist-packages/torchdrug/data/dataset.py](https://localhost:8080/#) in load_smiles(self, smiles_list, targets, transform, verbose, **kwargs)
    250                     logger.debug("Can't construct molecule from SMILES `%s`. Ignore this sample." % _smiles)
    251                     break
--> 252                 mol = data.Molecule.from_molecule(mol, **kwargs)
    253                 mols.append(mol)
    254             else:

[/usr/local/lib/python3.7/dist-packages/torchdrug/utils/decorator.py](https://localhost:8080/#) in wrapper(*args, **kwargs)
    113                     kwargs[value] = kwargs.pop(key)
    114 
--> 115             return func(*args, **kwargs)
    116 
    117         return wrapper

TypeError: from_molecule() got an unexpected keyword argument 'lazy'

bhadreshpsavani avatar Jul 28 '22 10:07 bhadreshpsavani

Where did you find that code? The one I see here, https://torchdrug.ai/docs/tutorials/retrosynthesis.html#prepare-the-dataset, is different and has no `lazy` argument at all. Moreover, the code from the link worked fine for me :)

DimGorr avatar Jul 28 '22 12:07 DimGorr

Seems like it has already been solved: https://github.com/DeepGraphLearning/torchdrug/pull/24

DimGorr avatar Jul 29 '22 08:07 DimGorr

Cool! Actually, I was trying it in Colab and thought of using this argument. When we check the arguments like this:

datasets.USPTO50k?

it shows `lazy` as an optional argument, but passing it gave the error.

bhadreshpsavani avatar Jul 29 '22 08:07 bhadreshpsavani

Hi @DimGorr, it shows this docstring:

Init signature: datasets.USPTO50k(*args, **kwargs)
Docstring:     
USPTO50k(path, as_synthon=False, verbose=1, transform=None, lazy=False, atom_feature='default', bond_feature='default', mol_feature=None, with_hydrogen=False, kekulize=False)

bhadreshpsavani avatar Jul 29 '22 08:07 bhadreshpsavani

Hi! Lazy loading isn't implemented for USPTO50k. The docstring is automatically generated through its inheritance from data.MoleculeDataset, which is why `lazy` appears there.

KiddoZhu avatar Aug 14 '22 22:08 KiddoZhu
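
To illustrate the failure mode from the traceback, here is a minimal sketch in plain Python. It is not torchdrug's code: `from_molecule`, `load_smiles`, and `supported_kwargs` below are hypothetical stand-ins, written on the assumption that the loader forwards every extra keyword unchanged to a constructor that has no `lazy` parameter.

```python
import inspect


def from_molecule(mol, node_feature="default", kekulize=False):
    # Hypothetical stand-in for the molecule constructor: it has no `lazy` parameter.
    return {"mol": mol, "node_feature": node_feature, "kekulize": kekulize}


def load_smiles(smiles_list, **kwargs):
    # Hypothetical stand-in for the loader: every extra keyword is
    # forwarded unchanged to the constructor above.
    return [from_molecule(smiles, **kwargs) for smiles in smiles_list]


def supported_kwargs(func, kwargs):
    # Keep only the keywords that `func` explicitly names in its signature.
    accepted = inspect.signature(func).parameters
    return {key: value for key, value in kwargs.items() if key in accepted}


options = {"lazy": True, "kekulize": True}

try:
    load_smiles(["CCO"], **options)
except TypeError as err:
    # The forwarded `lazy` keyword is rejected by the constructor.
    error_message = str(err)

# Dropping the unsupported keyword before the call avoids the error in this toy setup.
molecules = load_smiles(["CCO"], **supported_kwargs(from_molecule, options))
```

With the real library, the simplest workaround consistent with the reply above is to leave `lazy=True` out of the `datasets.USPTO50k(...)` call.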