boltz
How to run prediction for multiple protein sequences from the same yaml file
This is an excerpt of my YAML setup:
sequences:
  - protein:
      id: seq_2
      msa: seekgene_msa/N42_cdr3/seq_2.a3m
      sequence: CSVGTGDFGEQYF
  - protein:
      id: seq_4
      msa: seekgene_msa/N42_cdr3/seq_4.a3m
      sequence: CATSFSGPEQFF
  - protein:
      id: seq_6
      msa: seekgene_msa/N42_cdr3/seq_6.a3m
      sequence: CASSLHLGTGGTYEQYF
Running the prediction fails with:
ValueError: All proteins with the same sequence must share the same MSA!
How should I format my YAML so that I can run multiple protein sequences in the same YAML file?
See this example from the prediction instructions:
version: 1
sequences:
  - protein:
      id: [A, B]
      sequence: MVTPEGNVSLVDESLLVGVTDEDRAVRSAHQFYERLIGLWAPAVMEAAHELGVFAALAEAPADSGELARRLDCDARAMRVLLDALYAYDVIDRIHDTNGFRYLLSAEARECLLPGTLFSLVGKFMHDINVAWPAWRNLAEVVRHGARDTSGAESPNGIAQEDYESLVGGINFWAPPIVTTLSRKLRASGRSGDATASVLDVGCGTGLYSQLLLREFPRWTATGLDVERIATLANAQALRLGVEERFATRAGDFWRGGWGTGYDLVLFANIFHLQTPASAVRLMRHAAACLAPDGLVAVVDQIVDADREPKTPQDRFALLFAASMTNTGGGDAYTFQEYEEWFTAAGLQRIETLDTPMHRILLARRATEPSAVPEGQASENLYFQ
      msa: ./examples/msa/seq1.a3m
  - ligand:
      id: [C, D]
      ccd: SAH
  - ligand:
      id: [E, F]
      smiles: 'NC@@HC(=O)O'
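Going by the error message, the full file probably contains at least two protein entries with the same sequence but different msa paths. The excerpt above has three distinct sequences, so the duplicate would be elsewhere in the file. Here is a rough Python sketch for finding such entries and, if wanted, merging them into one entry with a list of ids, as in the example above. It assumes the YAML has already been loaded into a plain dict (for instance with PyYAML's yaml.safe_load). The helper names find_msa_conflicts and merge_identical_proteins are made up for illustration and are not part of boltz, and keeping the first msa path on merge is my own choice here.

```python
from collections import defaultdict


def find_msa_conflicts(config):
    """Return {sequence: [msa paths]} for sequences listed with more than one MSA."""
    msas_by_sequence = defaultdict(set)
    for entry in config.get("sequences", []):
        protein = entry.get("protein")
        if protein is None:
            continue  # ligands and other entity types are ignored
        msas_by_sequence[protein["sequence"]].add(str(protein.get("msa")))
    return {
        seq: sorted(msas)
        for seq, msas in msas_by_sequence.items()
        if len(msas) > 1
    }


def merge_identical_proteins(config):
    """Merge protein entries with identical sequences into one entry with a list of ids.

    Assumption: the MSA of the first occurrence is kept for the merged entry.
    """
    merged_by_sequence = {}
    merged_entries = []
    for entry in config.get("sequences", []):
        protein = entry.get("protein")
        if protein is None:
            merged_entries.append(entry)  # keep non-protein entries unchanged
            continue
        ids = protein["id"] if isinstance(protein["id"], list) else [protein["id"]]
        seq = protein["sequence"]
        if seq in merged_by_sequence:
            merged_by_sequence[seq]["id"].extend(ids)
        else:
            merged = dict(protein, id=list(ids))
            merged_by_sequence[seq] = merged
            merged_entries.append({"protein": merged})
    return dict(config, sequences=merged_entries)


if __name__ == "__main__":
    # Hypothetical config: seq_2 from the question plus a made-up duplicate entry.
    config = {
        "sequences": [
            {"protein": {"id": "seq_2", "msa": "seekgene_msa/N42_cdr3/seq_2.a3m",
                         "sequence": "CSVGTGDFGEQYF"}},
            {"protein": {"id": "seq_2_copy", "msa": "some/other/path.a3m",
                         "sequence": "CSVGTGDFGEQYF"}},
        ]
    }
    print(find_msa_conflicts(config))
    print(merge_identical_proteins(config))
```

If find_msa_conflicts returns an empty dict for the full file, the duplicate-sequence explanation does not apply and the cause is something else.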