foldseek
Cluster purity analysis with structurealign?
Expected Behavior
Hi, I found the cluster purity analysis using structurealign here (https://www.biorxiv.org/content/10.1101/2023.03.09.531927v1). The representative structure was aligned to the cluster members using the "structurealign -e INF -a" module in Foldseek to calculate the average LDDT and average TM-score per cluster. Could you provide a more detailed guide about this? I'm not sure if I need to complete the analysis with a loop script.
We calculated the TM-scores and LDDT scores using 3Di/AA structural alignments (structurealign). To obtain the TM-score and average LDDT score for the alignments we used the convertalis module.
Should this analysis be run on the clusters one by one using a loop script?
Could you please provide more details on that topic? E.g., how did you generate the prefilterdb comprising all query-target alignments, which is a required input for structurealign?
EDIT: couldn't all of these steps be done with a single easy-search command using --exhaustive-search 1?
The scripts for computing the purity per cluster are available here: https://github.com/steineggerlab/afdb-clusters-analysis
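For readers who only need the final aggregation step, here is a minimal Python sketch of averaging the scores per cluster without looping over clusters one by one. It is not taken from the linked repository. It assumes the alignments were already exported to a tab-separated file with four columns (representative, member, TM-score, LDDT), for example via convertalis with a matching --format-output; that column layout, the function names, and the choice to skip the representative's self-alignment are all assumptions made for illustration.

```python
import csv
import io
from collections import defaultdict


def cluster_purity(rows):
    """Average TM-score and LDDT per cluster representative.

    rows: iterable of (representative, member, tmscore, lddt) records.
    Returns {representative: (n_members, mean_tmscore, mean_lddt)}.
    Assumption: the representative's alignment to itself is skipped,
    so clusters with no other member do not appear in the result.
    """
    sums = defaultdict(lambda: [0, 0.0, 0.0])
    for rep, member, tm, lddt in rows:
        if rep == member:
            continue  # skip the self-alignment of the representative
        acc = sums[rep]
        acc[0] += 1
        acc[1] += float(tm)
        acc[2] += float(lddt)
    return {rep: (n, tm / n, ld / n) for rep, (n, tm, ld) in sums.items()}


def read_alignment_tsv(handle):
    """Read the assumed 4-column tab-separated layout from an open file."""
    return csv.reader(handle, delimiter="\t")


# Hypothetical toy input standing in for an exported alignment file.
example = (
    "repA\trepA\t1.0\t1.0\n"
    "repA\tm1\t0.8\t0.7\n"
    "repA\tm2\t0.6\t0.9\n"
    "repB\tm3\t0.5\t0.5\n"
)
purity = cluster_purity(read_alignment_tsv(io.StringIO(example)))
for rep, (n, tm, lddt) in sorted(purity.items()):
    print(f"{rep}\t{n}\t{tm:.3f}\t{lddt:.3f}")
```

To use it on a real file, replace the io.StringIO object with open("your_alignments.tsv", newline=""). Because every row carries its representative, a single pass over the whole file yields all per-cluster averages.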