
The bug happens when loading the DUD-E and scPDB datasets

Open StefanIsSmart opened this issue 1 year ago • 2 comments

Describe the bug The bug happens when loading the DUD-E and scPDB datasets.

To Reproduce Steps to reproduce the behavior:

  1. Just run the demo from your website:

         from tdc.generation import SBDD
         data = SBDD(name='dude')

Expected behavior Get the data object

Screenshots

    Found local copy for 1/2 file...
    Found local copy for 2/2 file...
    Done!
    Processing (this may take long)...
    100%|██████████| 102/102 [07:33<00:00, 4.44s/it]
    processing done, 0/40490 fails

    ValueError                                Traceback (most recent call last)
    /export/disk1/why/database/PL_interaciton_dataset/script/tmp.ipynb Cell 1 in ()
          1 from tdc.generation import SBDD
    ----> 2 data = SBDD(name='dude')

    File /export/disk3/why/software/Anaconda/conda/envs/RDKit/lib/python3.8/site-packages/tdc/generation/sbdd.py:44, in SBDD.__init__(self, name, path, print_stats, return_pocket, threshold, remove_protein_Hs, remove_ligand_Hs, keep_het, save)
         42 protein, ligand = bi_distribution_dataset_load(name, path, multiple_molecule_dataset_names, return_pocket, threshold, remove_protein_Hs, remove_ligand_Hs, keep_het)
         43 if save:
    ---> 44     np.savez(os.path.join(path, name + '.npz'),
         45         protein_coord=protein['coord'],
         46         protein_atom=protein['atom_type'],
         47         ligand_coord=ligand['coord'],
         48         ligand_atom=ligand['atom_type'],
         49     )
         50 self.save = save
         52 self.ligand = ligand

    File <__array_function__ internals>:200, in savez(*args, **kwargs)

    File /export/disk3/why/software/Anaconda/conda/envs/RDKit/lib/python3.8/site-packages/numpy/lib/npyio.py:615, in savez(file, *args, **kwds)
        531 @array_function_dispatch(_savez_dispatcher)
        532 def savez(file, *args, **kwds):
        533     """Save several arrays into a single file in uncompressed .npz format.
        534
        535     Provide arrays as keyword arguments to store them under the
       (...)
        613
        614     """
    --> 615     _savez(file, args, kwds, False)

    File /export/disk3/why/software/Anaconda/conda/envs/RDKit/lib/python3.8/site-packages/numpy/lib/npyio.py:716, in _savez(file, args, kwds, compress, allow_pickle, pickle_kwargs)
        714 for key, val in namedict.items():
        715     fname = key + '.npy'
    --> 716     val = np.asanyarray(val)
        717     # always force zip64, gh-10776
        718     with zipf.open(fname, 'w', force_zip64=True) as fid:

ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (40592,) + inhomogeneous part.
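The message points at the implicit `np.asanyarray(val)` call inside `_savez`: the values handed to `np.savez` appear to be lists whose entries have different lengths (one entry per complex, each with its own atom count), and NumPy 1.24 and later refuse to build an array from such a ragged list. The sketch below reproduces that behavior with plain NumPy and shows one possible workaround, packing the entries into an explicit object array before saving. The `coords` list and the `to_object_array` helper are hypothetical stand-ins, not TDC code, and the real shape of `protein['coord']` is an assumption.

```python
import io
import numpy as np

# Hypothetical stand-in for protein['coord']: one (n_atoms, 3) array per
# complex, each with a different number of atoms.
coords = [np.zeros((5, 3)), np.ones((7, 3)), np.zeros((4, 3))]


def to_object_array(items):
    """Pack variable-length items into a 1-D object array, one item per slot."""
    out = np.empty(len(items), dtype=object)
    for i, item in enumerate(items):
        out[i] = item
    return out


# np.savez calls np.asanyarray on each value. On NumPy >= 1.24 a ragged list
# raises ValueError at that point; older versions only emitted a warning.
try:
    np.asanyarray(coords)
    ragged_conversion_failed = False
except ValueError:
    ragged_conversion_failed = True

# Packing explicitly sidesteps the implicit conversion.
packed = to_object_array(coords)
buffer = io.BytesIO()
np.savez(buffer, protein_coord=packed)

# Object arrays are pickled inside the archive, so loading needs allow_pickle=True.
buffer.seek(0)
loaded = np.load(buffer, allow_pickle=True)["protein_coord"]
print(ragged_conversion_failed, packed.shape, loaded[1].shape)
```

Because the traceback shows the `np.savez` call sitting under `if save:`, passing `save=False` to `SBDD` may also skip the failing step; that is an untested guess based only on the frame shown above.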

Environment:

  • OS: Linux
  • Python version: 3.8.13
  • TDC version: 0.3.8
  • Any other relevant information: None

Additional context None.

StefanIsSmart • Oct 27 '23 15:10

Thanks for raising this issue! @yuanqidu, could you help take a look? Thanks!

kexinhuang12345 • Oct 30 '23 01:10

@yuanqidu can you help with this one?

amva13 • May 01 '24 16:05