
Question: Does foldseek work well on structures with no aa sequences?

Open ML-J opened this issue 1 year ago • 2 comments

Greetings team! We are using foldseek to cluster a batch of PDB files that have no amino acid sequence. We set the alignment mode to 3Di alignment, but the results still show that structures within one cluster have low similarity to each other (RMSD ~9).

The command we used is foldseek easy-cluster ./data cluster0.8 tmp -c 0.8 --alignment-mode 0

We tried a series of optional parameter combinations with little success. Is foldseek suitable for clustering structures without sequence information? And how should we adjust the parameters to obtain better clustering results?

ML-J avatar Aug 23 '23 03:08 ML-J

What kind of data are you trying to cluster? You could try setting a --tmscore-threshold. But just a disclaimer: foldseek is not meant to cluster huge sets of near-identical structures, but rather sets containing multiple groups of protein structures.

martin-steinegger avatar Aug 23 '23 04:08 martin-steinegger

Thanks for your reply! Concretely, our structure files are generated by generative models. The PDB files contain only the backbone atom coordinates, and all residue types are set to ALA.

ML-J avatar Aug 23 '23 07:08 ML-J
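
For readers following along, the --tmscore-threshold suggestion above might look roughly like the sketch below. This is not a command from the thread: the output prefix and the 0.5 cutoff are illustrative assumptions. It also assumes that --alignment-type 0 (rather than --alignment-mode 0, which the original command used) is the option that selects 3Di-only alignment; please confirm against `foldseek easy-cluster -h` for your version.

```shell
# Sketch only: out_prefix and tm_cutoff are hypothetical values chosen for illustration.
input_dir="./data"
out_prefix="cluster_tm"
tmp_dir="tmp"
tm_cutoff="0.5"

# Assumption: --alignment-type 0 selects 3Di-only alignment, which ignores the
# uninformative all-ALA residue labels; --tmscore-threshold rejects alignments
# whose TM-score falls below the cutoff.
cmd="foldseek easy-cluster $input_dir $out_prefix $tmp_dir -c 0.8 --alignment-type 0 --tmscore-threshold $tm_cutoff"

# Print the command so it can be inspected before anything is run.
echo "$cmd"

# Run only when foldseek is installed and the input directory exists.
if command -v foldseek >/dev/null 2>&1 && [ -d "$input_dir" ]; then
    $cmd
fi
```

Raising tm_cutoff should give tighter clusters at the cost of more singletons, so it may be worth trying a few values and checking the RMSD within each cluster.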