DeepFRI
Questions about data processing
Hello,
I have a few questions regarding the way you process the data.
- Your code seems to use nrPDB-GO_2019.06.18_train.txt and nrPDB-GO_2020.06.18_annot.tsv to build the training data, but the data you provide only includes nrPDB-GO_2019.06.18_annot.tsv. Is this expected?
- I analyzed your results files (DeepCNN-MERGED_molecular_function_results.pckl, DeepCNN-MERGED_cellular_component_results.pckl), and the test set has the same size for both ontologies. However, your Supplementary Table says that the test set size differs between MF, BP, and CC. Why?
- In your Supplementary Table, the train/val/test sets have different sizes for MF, BP, and CC. Shouldn't they be the same size?
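
For reference, the test set size comparison on the results files can be reproduced with something like the sketch below. It assumes each results file is a pickled dict with a `Y_true` entry holding one row per test protein. That key name is an assumption and may not match the actual file layout, and the helper names are hypothetical. If the files store NumPy arrays, NumPy must be installed for unpickling to work.

```python
import pickle
from pathlib import Path


def load_results(path):
    """Load one pickled results file (assumed to hold a dict)."""
    with open(path, "rb") as handle:
        return pickle.load(handle)


def count_test_proteins(results, key="Y_true"):
    """Return the number of rows under `key`.

    Assumption: results[key] has one row per test protein.
    """
    return len(results[key])


def compare_sizes(paths, key="Y_true"):
    """Map each results file that exists on disk to its test set size."""
    return {
        Path(p).name: count_test_proteins(load_results(p), key)
        for p in paths
        if Path(p).exists()
    }


if __name__ == "__main__":
    files = [
        "DeepCNN-MERGED_molecular_function_results.pckl",
        "DeepCNN-MERGED_cellular_component_results.pckl",
    ]
    for name, size in compare_sizes(files).items():
        print(f"{name}: {size} test proteins")
```

If the two printed sizes are equal, that matches what I observed and what I cannot reconcile with the Supplementary Table.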