alphafill
Use foldseek to find structures?
First of all, thanks for releasing a great tool. This seems to be useful not only for AlphaFold structures or homology models, but also for transferring ligands across PDB structures. Very neat.
I have a suggestion for an improvement. For identifying homologous structures, you use sequence-based BLAST. However, the Steinegger lab has released Foldseek, an excellent and very fast tool that can do structural searches. It should provide more accurate results (safer transfers) and also speed up the search for homologs. Moreover, it is fairly simple to run, just like BLAST. For details, see https://github.com/steineggerlab/foldseek.
Again, thanks for making the tool generally available - and the results through a web site.
That sounds interesting. It would be really cool if Foldseek could search the PDB-REDO databank. Is there something we can do from our side to make that easier? A practical question that has to do with scaling: what is the typical running time for Foldseek over the PDB? For a tool like AlphaFill (or rather for the databank), speed matters, as we have to do things a million times.
If you have all PDB files with ligands in a directory, making a Foldseek database is a single command. Alternatively, you can use their pre-built PDB database. The README on their GitHub page nicely explains the few commands you need to run.
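To make that concrete, here is a minimal sketch of the three Foldseek invocations from their README (`createdb`, `databases PDB`, `easy-search`), wrapped in Python so they could be called from a pipeline. The function names and all paths are made up for illustration; this is not how AlphaFill does or would do it, and it assumes the `foldseek` binary is on the `PATH`.

```python
import subprocess
from typing import List, Sequence


def createdb_cmd(structure_dir: str, db_name: str) -> List[str]:
    """Command to build a Foldseek database from a directory of structure files."""
    return ["foldseek", "createdb", structure_dir, db_name]


def download_pdb_cmd(db_name: str, tmp_dir: str) -> List[str]:
    """Command to fetch Foldseek's pre-built PDB database instead of building one."""
    return ["foldseek", "databases", "PDB", db_name, tmp_dir]


def easy_search_cmd(query: str, target_db: str, out_m8: str, tmp_dir: str) -> List[str]:
    """Command to search one query structure against a target database.

    The result is written to out_m8 in a BLAST-like tab-separated format.
    """
    return ["foldseek", "easy-search", query, target_db, out_m8, tmp_dir]


def run(cmd: Sequence[str]) -> None:
    """Run a command and raise CalledProcessError if it exits with an error."""
    subprocess.run(list(cmd), check=True)
```

Usage would look like `run(createdb_cmd("ligand_structures/", "ligandDB"))` followed by `run(easy_search_cmd("model.pdb", "ligandDB", "hits.m8", "tmp"))`, where the directory and file names are placeholders.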
For search, it is, surprisingly, even faster than BLAST, so replacing BLAST with Foldseek will speed up the flow. The output looks like BLAST's m8/tab format, so the choice between BLAST and Foldseek could even be made an option(?). The only challenge is to find a sensible cutoff. Foldseek is far more sensitive than BLAST, so you would need to set a very low E-value cutoff. We can ask the Foldseek developers for guidance.
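Since the output is tab-separated like BLAST's m8, filtering hits by E-value takes only a few lines. The sketch below assumes Foldseek's default 12-column layout as listed in its README (query, target, fident, alnlen, mismatch, gapopen, qstart, qend, tstart, tend, evalue, bits). The sample rows and the cutoff in the usage note are invented for illustration; a sensible real cutoff is exactly the open question.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List


@dataclass
class Hit:
    query: str
    target: str
    identity: float  # column 3; the scale may differ between BLAST and Foldseek
    aln_len: int
    evalue: float
    bits: float


def parse_m8(lines: Iterable[str]) -> Iterator[Hit]:
    """Parse m8-style tab-separated lines, skipping blank and comment lines."""
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 12:
            raise ValueError(f"expected at least 12 columns, got {len(fields)}")
        yield Hit(
            query=fields[0],
            target=fields[1],
            identity=float(fields[2]),
            aln_len=int(fields[3]),
            evalue=float(fields[10]),
            bits=float(fields[11]),
        )


def filter_hits(hits: Iterable[Hit], max_evalue: float) -> List[Hit]:
    """Keep only hits whose E-value is at or below the cutoff."""
    return [h for h in hits if h.evalue <= max_evalue]


# Invented sample rows, only to show the column layout.
SAMPLE = "\n".join([
    "Q1\t1abc_A\t0.85\t200\t30\t0\t1\t200\t1\t200\t1.2e-30\t500",
    "Q1\t2xyz_B\t0.22\t150\t110\t3\t5\t150\t8\t160\t3.0e-4\t60",
    "Q1\t3foo_C\t0.10\t80\t70\t2\t10\t90\t12\t95\t5.0\t20",
])
```

For example, `filter_hits(parse_m8(SAMPLE.splitlines()), max_evalue=1e-3)` keeps the first two sample hits and drops the third. Because only the column positions matter, the same parser could serve either tool, which is what would make a BLAST/Foldseek switch cheap to offer.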
I might have a look at Foldseek, if I can find the time.