deeprank

minor issues about hdf5 files

Open LilySnow opened this issue 7 years ago • 6 comments

  1. Maybe we should not call the pdb file of a model "native" in the BM4 hdf5 files (e.g., 1E6E.hdf5), and should call it "pdb" instead:

In [8]: list(f['1E6E_9w'])
Out[8]: ['complex', 'features', 'features_raw', 'grid_points', 'mapped_features', 'native', 'targets']

  2. Shall we also put the HADDOCK score in the BM4 hdf5 files, for easy comparison with the HADDOCK scoring function?

  3. In the final output data.hdf5 file, shall we also store the model IDs? Currently it only contains the target DockQ and the predicted DockQ, which makes it inconvenient to compare with other methods.

LilySnow avatar Mar 06 '18 10:03 LilySnow
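
A minimal sketch of the third point, assuming h5py and NumPy are available: it stores the model IDs next to the target and predicted DockQ values so that rows can be matched against other scoring methods. The function name `save_predictions`, the group name `predictions`, the dataset names `model_ids`, `targets` and `outputs`, the second model ID and all numeric values are hypothetical; they are not taken from the actual data.hdf5 layout.

```python
import h5py
import numpy as np


def save_predictions(h5file, model_ids, targets, outputs, group="predictions"):
    """Store model IDs next to target and predicted DockQ values.

    All group and dataset names here are hypothetical placeholders.
    """
    grp = h5file.require_group(group)
    # Variable-length UTF-8 strings, so IDs of any length fit.
    str_dtype = h5py.string_dtype(encoding="utf-8")
    grp.create_dataset("model_ids", data=np.array(model_ids, dtype=object), dtype=str_dtype)
    grp.create_dataset("targets", data=np.asarray(targets, dtype=float))
    grp.create_dataset("outputs", data=np.asarray(outputs, dtype=float))
    return grp


# In-memory file (nothing is written to disk) with made-up example values.
with h5py.File("data_sketch.hdf5", "w", driver="core", backing_store=False) as f:
    grp = save_predictions(f, ["1E6E_9w", "1E6E_10w"], [0.82, 0.11], [0.75, 0.20])
    # Depending on the h5py version, strings come back as bytes or str.
    ids = [m.decode("utf-8") if isinstance(m, bytes) else m for m in grp["model_ids"][()]]
    # One (model ID, target DockQ, predicted DockQ) tuple per model.
    rows = list(zip(ids, grp["targets"][()], grp["outputs"][()]))
```

With the IDs stored in the same order as the scores, each row can be joined on the model ID with the ranking produced by another method.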

'complex' is the pdb of the conformation (e.g. 1E6E_9w.pdb) and 'native' is the corresponding native conformation (e.g. 1E6E.pdb). That being said, we can rename all of that very easily.

NicoRenaud avatar Mar 06 '18 13:03 NicoRenaud

Inside 1E6E.hdf5, we already have a folder for the native, as below, right? Should we then remove the "native" entry for all models inside 1E6E.hdf5, since they seem to be redundant?

In [6]: list(f['1E6E'])
Out[6]: ['complex', 'features', 'features_raw', 'grid_points', 'mapped_features', 'native', 'targets']

LilySnow avatar Mar 06 '18 13:03 LilySnow

We can 'clean' the hdf5 file and remove all entries that are not needed, but this comes at the cost of possibly not being able to add new data to it.

NicoRenaud avatar Mar 06 '18 14:03 NicoRenaud
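
Such a cleaning step could look roughly like the sketch below, which uses h5py's `del group[name]` to unlink one entry from every top-level group. The helper name `drop_entry` and the placeholder datasets are hypothetical, and the layout (one top-level group per model, each holding a 'native' entry) is only inferred from the listings in this thread.

```python
import h5py
import numpy as np


def drop_entry(h5file, entry="native", keep_in=()):
    """Unlink `entry` from every top-level group except those named in `keep_in`.

    Returns the names of the groups the entry was removed from.
    """
    removed = []
    for name, grp in h5file.items():
        if name in keep_in or not isinstance(grp, h5py.Group):
            continue
        if entry in grp:
            del grp[entry]  # removes the link to the dataset
            removed.append(name)
    return removed


# In-memory demo file that mimics the assumed layout with placeholder data.
with h5py.File("clean_sketch.hdf5", "w", driver="core", backing_store=False) as f:
    for mol in ("1E6E", "1E6E_9w"):
        g = f.create_group(mol)
        g.create_dataset("complex", data=np.array([b"ATOM (placeholder)"]))
        g.create_dataset("native", data=np.array([b"ATOM (placeholder)"]))

    # Keep the native only in the reference group '1E6E'.
    removed = drop_entry(f, entry="native", keep_in=("1E6E",))
    remaining = {mol: sorted(f[mol].keys()) for mol in f}
```

Deleting an entry only unlinks it; the file on disk generally does not shrink until it is rewritten, for example with the `h5repack` tool.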

Sorry, I do not understand. Why do we have to have a copy of the native pdb file in each of the model files?

LilySnow avatar Mar 06 '18 14:03 LilySnow

It's needed to compute i-RMSD, l-RMSD and DockQ. For convenience it's stored there as well, but it can be removed if needed.

NicoRenaud avatar Mar 06 '18 14:03 NicoRenaud

But the pdb file of the native is already in 1E6E.hdf5 as a separate entry, right?

LilySnow avatar Mar 06 '18 15:03 LilySnow
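
One way to keep a 'native' entry under every model without duplicating the data is an HDF5 hard link, which h5py creates when an existing object is assigned to a new name. The sketch below is only an illustration of that mechanism, not something proposed in the thread; the group layout and the placeholder pdb content are assumptions.

```python
import h5py
import numpy as np

# In-memory file; group names follow the listings in this thread,
# the dataset content is a placeholder rather than real pdb data.
with h5py.File("link_sketch.hdf5", "w", driver="core", backing_store=False) as f:
    ref = f.create_group("1E6E")
    ref.create_dataset("native", data=np.array([b"ATOM (placeholder)"]))

    model = f.create_group("1E6E_9w")
    # Assigning an existing dataset creates a hard link, not a copy:
    # both names now point to the same object in the file.
    model["native"] = ref["native"]

    # h5py compares objects by identity, so this is True for hard links.
    same = model["native"] == ref["native"]
    first_line = model["native"][0]
```

Code that computes i-RMSD, l-RMSD or DockQ could then keep reading `f['1E6E_9w/native']` while the native is stored only once.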