How to run prediction for multiple protein sequences from the same yaml file

Open amoschoomy opened this issue 1 year ago • 1 comments

This is an excerpt of my YAML setup:

sequences:
- protein:
    id: seq_2
    msa: seekgene_msa/N42_cdr3/seq_2.a3m
    sequence: CSVGTGDFGEQYF
- protein:
    id: seq_4
    msa: seekgene_msa/N42_cdr3/seq_4.a3m
    sequence: CATSFSGPEQFF
- protein:
    id: seq_6
    msa: seekgene_msa/N42_cdr3/seq_6.a3m
    sequence: CASSLHLGTGGTYEQYF

This fails with:

ValueError: All proteins with the same sequence must share the same MSA!
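The error message states a rule: entries with identical sequences must point at the same MSA file. A minimal sketch of a pre-flight check for that rule follows. It works on a plain list of dicts shaped like the `protein` entries above; `find_msa_conflicts` is a hypothetical helper, not part of Boltz, and the `seq_9` entry is invented purely to trigger a conflict.

```python
from collections import defaultdict


def find_msa_conflicts(proteins):
    """Return {sequence: sorted MSA paths} for sequences tied to more than one MSA."""
    msas_by_sequence = defaultdict(set)
    for entry in proteins:
        msas_by_sequence[entry["sequence"]].add(entry.get("msa"))
    return {
        seq: sorted(str(m) for m in msas)
        for seq, msas in msas_by_sequence.items()
        if len(msas) > 1
    }


# Hypothetical entries: seq_2 and seq_9 share a sequence but use different MSA files.
proteins = [
    {"id": "seq_2", "sequence": "CSVGTGDFGEQYF", "msa": "seekgene_msa/N42_cdr3/seq_2.a3m"},
    {"id": "seq_9", "sequence": "CSVGTGDFGEQYF", "msa": "seekgene_msa/N42_cdr3/seq_9.a3m"},
    {"id": "seq_4", "sequence": "CATSFSGPEQFF", "msa": "seekgene_msa/N42_cdr3/seq_4.a3m"},
]

conflicts = find_msa_conflicts(proteins)
for seq, msas in conflicts.items():
    print(f"{seq} is paired with several MSA files: {msas}")
```

Running a check like this over the full file (not only the excerpt) may show which repeated sequence is responsible for the error.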

How should I format my YAML so that I can run multiple protein sequences in the same YAML file?

amoschoomy, Feb 25 '25 03:02

See this example from the prediction instructions:

version: 1
sequences:
- protein:
    id: [A, B]
    sequence: MVTPEGNVSLVDESLLVGVTDEDRAVRSAHQFYERLIGLWAPAVMEAAHELGVFAALAEAPADSGELARRLDCDARAMRVLLDALYAYDVIDRIHDTNGFRYLLSAEARECLLPGTLFSLVGKFMHDINVAWPAWRNLAEVVRHGARDTSGAESPNGIAQEDYESLVGGINFWAPPIVTTLSRKLRASGRSGDATASVLDVGCGTGLYSQLLLREFPRWTATGLDVERIATLANAQALRLGVEERFATRAGDFWRGGWGTGYDLVLFANIFHLQTPASAVRLMRHAAACLAPDGLVAVVDQIVDADREPKTPQDRFALLFAASMTNTGGGDAYTFQEYEEWFTAAGLQRIETLDTPMHRILLARRATEPSAVPEGQASENLYFQ
    msa: ./examples/msa/seq1.a3m
- ligand:
    id: [C, D]
    ccd: SAH
- ligand:
    id: [E, F]
    smiles: 'NC@@HC(=O)O'
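In that example, identical chains are written once, with a list of chain IDs and a single MSA. As a rough sketch of the same idea applied programmatically, the snippet below collapses entries that share a sequence into one entry with a list `id`. `merge_identical_chains` is a hypothetical helper, not a Boltz function, and the chain IDs and entries are illustrative.

```python
def merge_identical_chains(proteins):
    """Collapse entries with the same sequence into one entry whose id is a list."""
    merged = {}
    for entry in proteins:
        seq = entry["sequence"]
        if seq not in merged:
            # First time this sequence appears: start a new merged entry.
            merged[seq] = {"id": [entry["id"]], "sequence": seq, "msa": entry.get("msa")}
            continue
        if merged[seq]["msa"] != entry.get("msa"):
            # Same sequence, different MSA: the case the error message rejects.
            raise ValueError(f"conflicting MSA files for sequence {seq}")
        merged[seq]["id"].append(entry["id"])
    return list(merged.values())


# Illustrative input: chains A and B are copies of the same sequence.
chains = [
    {"id": "A", "sequence": "CSVGTGDFGEQYF", "msa": "seekgene_msa/N42_cdr3/seq_2.a3m"},
    {"id": "B", "sequence": "CSVGTGDFGEQYF", "msa": "seekgene_msa/N42_cdr3/seq_2.a3m"},
    {"id": "C", "sequence": "CATSFSGPEQFF", "msa": "seekgene_msa/N42_cdr3/seq_4.a3m"},
]

merged_chains = merge_identical_chains(chains)
for chain in merged_chains:
    print(chain["id"], chain["msa"])
```

Each merged entry maps onto one `- protein:` item with `id: [..]`, matching the layout of the example above.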

hughhigin, Mar 06 '25 19:03