DeeplyTough

Vertex dataset is missing some pocket PDB files

Open mylRalph opened this issue 9 months ago • 0 comments

Hi, glad to see your excellent work! I followed your code to preprocess TOUGH-M1 and Vertex for training and evaluation with --db_preprocessing set to 0, trying to obtain the same splits used in your work. However, I encountered the problems below:

  1. I can't find the corresponding pocket PDB files at several pocket paths, e.g. DeeplyTough/STRUCTURE_DATA_DIR/Vertex/4cmt/4cmt_site_2.pdb, DeeplyTough/STRUCTURE_DATA_DIR/Vertex/4a9t/4a9t_site_2.pdb, and DeeplyTough/STRUCTURE_DATA_DIR/Vertex/4anu/4anu_site_2.pdb, etc. This results in 23,380 invalid pocket pairs, so my count doesn't match the number in the paper (1,461,668 positive and 102,935 negative pocket pairs, 1,564,603 pairs in total).
  2. After filtering TOUGH-M1 for the Vertex evaluation with --db_exclude_vertex set to 'seqclust', I got 6,580 structures left for training, which doesn't match your result of 6,548 structures. Yet the number of pocket pairs constructed from these 6,580 TOUGH-M1 structures exactly matches your result, 710,009 pairs, which confuses me.
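
For reference, here is a minimal sketch of one way to list the missing pocket files. It assumes the `<root>/<pdb_code>/<pdb_code>_site_<n>.pdb` layout seen in the paths above. The helper names `pocket_path` and `find_missing_pockets`, and the `expected` list of (PDB code, site index) pairs, are hypothetical and not part of the DeeplyTough code; the real list of expected pockets would have to come from the Vertex pocket list.

```python
from pathlib import Path
from typing import Iterable, List, Tuple, Union


def pocket_path(root: Path, pdb_code: str, site: int) -> Path:
    """Build the pocket file path, assuming <root>/<code>/<code>_site_<n>.pdb."""
    return root / pdb_code / f"{pdb_code}_site_{site}.pdb"


def find_missing_pockets(
    root: Union[str, Path], expected: Iterable[Tuple[str, int]]
) -> List[Path]:
    """Return the expected pocket paths that do not exist as files on disk."""
    root = Path(root)
    missing = []
    for pdb_code, site in expected:
        path = pocket_path(root, pdb_code, site)
        if not path.is_file():
            missing.append(path)
    return missing


if __name__ == "__main__":
    # Hypothetical input: the three examples from item 1 above.
    expected = [("4cmt", 2), ("4a9t", 2), ("4anu", 2)]
    for path in find_missing_pockets("DeeplyTough/STRUCTURE_DATA_DIR/Vertex", expected):
        print(path)
```

Comparing the printed paths against the pocket list would show whether the gaps are confined to particular structures or site indices.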

I would really appreciate it if I could get your help! Looking forward to your reply!

mylRalph, Sep 10 '23 10:09