ROBIN
The VSA_EState* values appear to have been generated incorrectly across the entire training data.
Thanks for publishing this work. I ran the program to use ROBIN in our study and noticed something strange.
When I run the analysis using the provided file (Mordred_Test_Compounds_3D.csv), I get the results stated in the paper, but when I run the analysis with descriptors generated directly from the SDF file, I get different results.
Comparing the two sets of descriptor files, I found that the VSA_EState* values differ significantly. In the provided files (Mordred_Test_Compounds_3D.csv, Mordred_ROBIN_RNA_Binder_3D.csv), the VSA_EState1 to VSA_EState7 values are almost all 0, as shown below, whereas the descriptors I generate myself contain nonzero values in these columns.
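A minimal sketch of how such a comparison could be made by column name, using only the Python standard library. The helper names (`load_columns`, `differing_entries`) are hypothetical and not part of ROBIN or Mordred, and the "regenerated" value in the example is a placeholder, not real Mordred output.

```python
import csv
import io


def load_columns(csv_file, prefix="VSA_EState", key="name"):
    """Map each compound name to {column: value} for columns starting with prefix."""
    table = {}
    for row in csv.DictReader(csv_file):
        table[row[key]] = {
            col: float(val) for col, val in row.items() if col.startswith(prefix)
        }
    return table


def differing_entries(provided, regenerated, tol=1e-6):
    """Return (name, column, provided_value, regenerated_value) where values disagree."""
    diffs = []
    for name, cols in provided.items():
        for col, value in cols.items():
            other = regenerated.get(name, {}).get(col)
            if other is not None and abs(value - other) > tol:
                diffs.append((name, col, value, other))
    return diffs


# Illustrative in-memory tables; with real data, pass open file handles instead.
provided = load_columns(io.StringIO("name,VSA_EState1\nImatinib,0.0\n"))
# 1.0 below is a placeholder value, NOT a real regenerated descriptor.
regenerated = load_columns(io.StringIO("name,VSA_EState1\nImatinib,1.0\n"))
print(differing_entries(provided, regenerated))
```

Matching on the `name` column and on header names avoids depending on row order or on fixed field positions, which may differ between Mordred versions.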
Here are the package versions I used:
- rdkit : 2022.9.5
- mordred : 1.2.0
- tensorflow : 2.3.1
- scikit-learn : 1.0.2
- numpy : 1.18.5
- scipy : 1.9.3
$ cat Mordred_Test_Compounds_3D.csv | cut -d ',' -f 1,1561,1562,1563,1564,1565,1566,1567
name,VSA_EState1,VSA_EState2,VSA_EState3,VSA_EState4,VSA_EState5,VSA_EState6,VSA_EState7
ADQ,0.0,0.0,0.0,0.0,0.0,0.0,0.0
HIV TAR compound 4,0.0,0.0,0.0,0.0,0.0,0.0,0.0
Ribocil-A,0.0,0.0,0.0,0.0,0.0,0.0,0.0
Tetracycline,0.0,0.0,0.0,0.0,0.0,0.0,0.0
Imatinib,0.0,0.0,0.0,0.0,0.0,0.0,0.0
Ibrutinib,0.0,0.0,0.0,0.0,7.188619484542558,0.0,0.0
Lovastatin,0.0,0.0,0.0,0.0,0.0,0.0,0.0
Nevirapine,0.0,0.0,0.0,0.0,0.0,0.0,0.0
$ cat Mordred_ROBIN_RNA_Binder_3D.csv | cut -d ',' -f 1,1561,1562,1563,1564,1565,1566,1567 | head -n 5
name,VSA_EState1,VSA_EState2,VSA_EState3,VSA_EState4,VSA_EState5,VSA_EState6,VSA_EState7
0054-0090,0.0,0.0,0.0,0.0,0.0,0.0,0.0
0096-0280,0.0,0.0,0.0,0.0,0.0,0.0,0.0
0109-0002,0.0,0.0,0.0,0.0,0.0,0.0,0.0
0109-0045,0.0,0.0,0.0,0.0,0.0,0.0,0.0
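The same check can be sketched in Python by selecting columns by header name rather than by field number, which also avoids problems if a compound name ever contains a quoted comma. This is only an illustrative sketch; the function name `zero_fraction` is hypothetical, and non-numeric cells are simply skipped as an assumption about how they should be handled.

```python
import csv
import io


def zero_fraction(csv_file, prefix="VSA_EState"):
    """Return {column: fraction of rows whose value is exactly 0} for matching columns."""
    reader = csv.DictReader(csv_file)
    columns = [c for c in (reader.fieldnames or []) if c.startswith(prefix)]
    zeros = {c: 0 for c in columns}
    total = 0
    for row in reader:
        total += 1
        for c in columns:
            try:
                if float(row[c]) == 0.0:
                    zeros[c] += 1
            except (TypeError, ValueError):
                # Assumption: non-numeric or missing cells are not counted as zero.
                continue
    return {c: (zeros[c] / total if total else 0.0) for c in columns}


# Two rows taken from the output above, reduced to two columns for brevity.
sample = io.StringIO(
    "name,VSA_EState4,VSA_EState5\n"
    "Imatinib,0.0,0.0\n"
    "Ibrutinib,0.0,7.188619484542558\n"
)
print(zero_fraction(sample))  # {'VSA_EState4': 1.0, 'VSA_EState5': 0.5}
```

For the real files, an open file handle (for example `open("Mordred_Test_Compounds_3D.csv", newline="")`) can be passed in place of the in-memory sample.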