ProDy
Excluding contacts from flanking residues
Hi ProDy team, I keep finding myself looping through the tuples returned by findNeighbors to look up the atomic indices, find the residue numbers in my original model, and then exclude any contacts from a given number of flanking residues (depending on the contact radius I have chosen). This is the slowest part of my analysis, and when I'm going through >30K PDB files it can take days just for this one step. Is it possible to exclude contacts from n flanking residues from the contact list as it is being generated? If not, is this a feature you would consider adding?
It should be doable with some scripting, but it sounds like a good feature to have. We'll look into it at some point soon.
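For reference, the scripting route could look roughly like the sketch below. It assumes the contacts are (atom, atom, distance) tuples as returned by findNeighbors, that each atom exposes getResnum(), and that all atoms belong to a single chain. The names filter_flanking and FakeAtom are hypothetical; FakeAtom is only a stand-in so the example runs without ProDy or a PDB file.

```python
class FakeAtom:
    """Hypothetical stand-in for a ProDy atom; only getResnum() is used here."""

    def __init__(self, resnum):
        self._resnum = resnum

    def getResnum(self):
        return self._resnum


def filter_flanking(contacts, n_flank):
    """Keep contacts whose residues are more than n_flank apart in sequence.

    Assumes a single chain, so residue-number difference is a usable
    proxy for sequence distance.
    """
    return [
        (a, b, d)
        for a, b, d in contacts
        if abs(a.getResnum() - b.getResnum()) > n_flank
    ]


# Toy contact list in the (atom, atom, distance) shape described above.
contacts = [
    (FakeAtom(10), FakeAtom(11), 3.8),
    (FakeAtom(10), FakeAtom(13), 5.2),
    (FakeAtom(10), FakeAtom(40), 7.1),
]
kept = filter_flanking(contacts, n_flank=3)
print([(a.getResnum(), b.getResnum()) for a, b, _ in kept])
```

With real data, the same function could be applied to the list returned by findNeighbors. It still walks every contact in Python, so it does not remove the per-file cost described in the question.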
Ok, I have now added the following behaviour:
In [1]: from prody import *
In [2]: ag = parsePDB('1hk3', chain='A', subset='ca', compressed=False)
@> PDB file is found in working directory (1hk3.pdb).
@> 559 atoms and 1 coordinate set(s) were parsed in 0.01s.
@> Secondary structures were assigned to 456 residues.
In [3]: sel = ag.select('resnum 1 to 50')
In [4]: conts = findNeighbors(sel, radius=8)
In [5]: len(conts)
Out[5]: 182
In [6]: conts = findNeighbors(sel, radius=8, seqdist=3)
In [7]: len(conts)
Out[7]: 55
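There is now a new argument seqdist for iterNeighbors and findNeighbors that filters contacts by sequence distance: only contacts with a sequence distance > seqdist are kept, so contacts between residues that are close in sequence are excluded. Please let me know if you want something different.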
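To make the intended semantics concrete, here is a minimal sketch of a neighbor search that applies the sequence-distance filter while pairs are being generated. It is not ProDy's implementation: a brute-force loop stands in for the real neighbor search, iter_neighbors is a hypothetical name, and it assumes that pairs with a residue-number difference of seqdist or less are dropped.

```python
from itertools import combinations
from math import dist


def iter_neighbors(coords, resnums, radius, seqdist=None):
    """Yield (i, j, distance) for pairs within radius.

    If seqdist is given, pairs whose residue numbers differ by seqdist
    or less are skipped before the distance is even computed.
    """
    for i, j in combinations(range(len(coords)), 2):
        if seqdist is not None and abs(resnums[i] - resnums[j]) <= seqdist:
            continue
        d = dist(coords[i], coords[j])
        if d <= radius:
            yield i, j, d


# Four toy CA positions: three consecutive residues plus one distant in sequence.
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (3.8, 3.0, 0.0)]
resnums = [1, 2, 3, 10]

all_pairs = list(iter_neighbors(coords, resnums, radius=8))
far_pairs = list(iter_neighbors(coords, resnums, radius=8, seqdist=3))
print(len(all_pairs), len(far_pairs))
```

Filtering inside the generator means no tuple is ever created for a flanking pair, which is the saving the original request was after.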
We also have an interactions module that has a kwarg seq_cutoff that is a sequential atom index cutoff.
I have now made iterNeighbors and findNeighbors use the argument name seqsep, like the evol rank order code, and have also added this argument to buildDistMatrix.