Improved support for non-protein entities

Open tlitfin-unsw opened this issue 5 months ago • 0 comments

Description of feature

AlphaFold3, RoseTTAFold-All-Atom, Boltz and HelixFold3 all have the ability to model non-protein entities.

In proteinfold, the AlphaFold3 mode currently supports only protein monomers.

The Boltz, RoseTTAFold-All-Atom and HelixFold3 modes support non-protein entities via mode-specific file formats, but because each format is tied to one mode, this does not allow multiple modes to be run simultaneously.

Currently, Boltz is the only mode to support non-protein entities via FASTA format.

This works by designating the entity type (protein, rna, dna, smiles, ccd) in the FASTA header (e.g. >A|protein), with the molecule type guessed as a fallback (implemented here).
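
As a rough illustration of the header convention described above, the sketch below parses `>ID|type` headers and falls back to guessing when no valid type is given. It is not the proteinfold or Boltz code: the function names (`parse_fasta`, `guess_entity_type`) and the alphabet-based heuristic are assumptions made for this example only.

```python
# Hypothetical sketch: parse FASTA headers of the form ">A|protein" and
# guess the entity type when the header does not declare a valid one.
VALID_TYPES = {"protein", "rna", "dna", "smiles", "ccd"}


def guess_entity_type(sequence):
    """Assumed fallback heuristic based only on the sequence alphabet."""
    residues = set(sequence.upper())
    if residues <= set("ACGTN"):
        return "dna"
    if residues <= set("ACGUN"):
        return "rna"
    # Anything else is treated as protein; SMILES and CCD codes cannot be
    # detected reliably this way and should be declared in the header.
    return "protein"


def _make_record(header, sequence):
    fields = [field.strip() for field in header.split("|")]
    chain_id = fields[0]
    declared = fields[1].lower() if len(fields) > 1 else ""
    if declared in VALID_TYPES:
        entity_type = declared
    else:
        entity_type = guess_entity_type(sequence)
    return chain_id, entity_type, sequence


def parse_fasta(text):
    """Return a list of (chain_id, entity_type, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append(_make_record(header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append(_make_record(header, "".join(chunks)))
    return records
```

A short nucleotide-only sequence is ambiguous under this heuristic (a peptide made only of A, C, G and T would be labelled DNA), which is one reason an explicit type in the header is safer than relying on the fallback.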

General support for non-protein entities could be provided by:

  • Adopting the current Boltz implementation:
    • overloading the FASTA header
    • falling back to type guessing

OR

  • Modifying the samplesheet schema to contain entity-level fields, which could be assembled into module-specific formats using a proteinfold utility:
    • protein_fasta
    • rna_fasta
    • dna_fasta
    • smiles_fasta
    • ccd_fasta
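
To make the second option more concrete, here is a minimal sketch of what such a utility might do for one target format: combine the entity-level fields of a samplesheet row into a single FASTA with type-tagged headers. The function name `assemble_tagged_fasta` is hypothetical, and for simplicity the row values are assumed to hold the FASTA text itself; reading files from paths is left out.

```python
# Hypothetical sketch: merge entity-level samplesheet fields into one
# FASTA whose headers carry the entity type (">ID|type").
ENTITY_COLUMNS = {
    "protein_fasta": "protein",
    "rna_fasta": "rna",
    "dna_fasta": "dna",
    "smiles_fasta": "smiles",
    "ccd_fasta": "ccd",
}


def assemble_tagged_fasta(row):
    """row maps column names to FASTA text (assumed already loaded)."""
    out = []
    for column, entity_type in ENTITY_COLUMNS.items():
        text = row.get(column) or ""
        for line in text.splitlines():
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                # Keep the chain ID, replace any existing type tag.
                chain_id = line[1:].split("|")[0].strip()
                out.append(f">{chain_id}|{entity_type}")
            else:
                out.append(line)
    return "\n".join(out) + "\n" if out else ""
```

Other modes would need their own writer for their specific input format, but each could start from the same entity-level fields, which is what would let several modes run from one samplesheet.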

tlitfin-unsw, Aug 03 '25 11:08