DeepPurpose icon indicating copy to clipboard operation
DeepPurpose copied to clipboard

Query Regarding MPNN Drug Encoder

Open SudhirGhandikota opened this issue 1 year ago • 3 comments

Hello,

Before I ask my query, I would like to congratulate everybody involved in building this great framework. I was trying to train an MPNN-CNN model on the latest BindingDB data and while exploring the existing MPNN Drug encoder implementation, I ran into a couple of doubts/queries. I was hoping to get some clarification from you guys.

From what I understand, the agraph property associated with a drug is basically an adjacency list where the row_indexrepresents an atom and each column value is the in_bond number/index. I deduced this from the following code snippet in the smiles2mpnnfeature method.

for a in range(n_atoms):
            for i,b in enumerate(in_bonds[a]):
                agraph[a,i] = b

Then, during the MPNN model training, I can see that the agraphs associated with all drugs in a given batch are combined into one combined agraph_1st variable. In this consolidation step, you used two variables N_a and N_b to maintain the running sum of atoms and bonds from each drug. Also, to re-index this combined adjacency list, I can see that you are adding the N_a value at each step (code snippet below).

for i in range(N_atoms_bond.shape[0]):
            ....
            agraph_lst.append(agraph[i,:atom_num,:] + N_a)

However, since the values in these adjacency lists are bond numbers or indices, shouldn't we be adding the N_b variable value instead? Could you please clarify this doubt of mine?

Thanks in advance!

SudhirGhandikota avatar May 30 '23 15:05 SudhirGhandikota

agraph_lst save the index of atom. N_a is the number of total atoms before the current molecule. so we need to add the value. it is not related to the bond number.

futianfan avatar May 30 '23 23:05 futianfan

Hello,

Thanks for your response. However, the agraph_1st variable is populated using the agraph value, which seems to contain the bond number right (based on the first code snippet)??

SudhirGhandikota avatar May 31 '23 00:05 SudhirGhandikota

not bond number. but atom number. thanks

futianfan avatar May 31 '23 00:05 futianfan