foldseek
An observed discrepancy between alignment types 3Di+AA and 3Di
Expected Behavior
Thanks for your amazing tool! I am clustering a high-confidence subset of AFDB with two alignment types, 3Di+AA and 3Di. My intuition is that 3Di should give more non-singleton clusters than 3Di+AA, because very diverse sequences that share the same structure would be assigned to the same cluster in 3Di mode but to different clusters in 3Di+AA mode.
Current Behavior
I tested the cluster command with the two alignment types on the same database (a subset containing 4 million AFDB structures): --alignment-type 0 (3Di) gives me 470715 singletons, while --alignment-type 1 (3Di) gives me 759500 singletons.
This is the cluster command I use: foldseek cluster afdb50_new afdb50_new_clust_v2 tmp --remove-tmp-files --alignment-type 0 / --alignment-type 2
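For anyone wanting to reproduce the singleton counts, here is a minimal sketch of how they can be derived. It assumes the cluster result has been exported to a two-column, tab-separated file (representative, then member), which is the layout I would expect from `foldseek createtsv`. The function name `count_singletons` and the toy data are only for illustration.

```python
from collections import Counter
import io


def count_singletons(lines):
    """Return (number of singleton clusters, total number of clusters).

    Each line is assumed to be 'representative<TAB>member'; a cluster whose
    representative appears on exactly one line has a single member.
    """
    sizes = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        rep = line.split("\t")[0]
        sizes[rep] += 1
    singletons = sum(1 for n in sizes.values() if n == 1)
    return singletons, len(sizes)


if __name__ == "__main__":
    # Toy input: cluster A has 2 members, C has 1, D has 3.
    toy = io.StringIO("A\tA\nA\tB\nC\tC\nD\tD\nD\tE\nD\tF\n")
    singletons, total = count_singletons(toy)
    print(f"{singletons} singleton(s) out of {total} cluster(s)")
```

On a real export, the same function can be given an open file handle instead of the toy `StringIO` object.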
Is this the expected result? It looks like clustering based solely on 3Di tokens works worse than 3Di+AA. What is your suggestion if I want to cluster on structure without the AA tokens?
Any help would be appreciated!