templates - working example
Can you provide a working example for using templates? I tried applying templates, everything worked, but the result isn't affected by the template.
Yeah, I have the same issue: I added a custom CIF file as a template, but the predicted structure does not match the (relative) atom coordinates of the input template file. The paper says:
In Boltz-2, we improve on these two fronts: we allow for multimeric templates and we allow the user to strictly enforce that templates are respected via a Boltz-steering potential.
How can we enforce the Boltz-steering potential?
Just noticed that decreasing the MSA depth really improves it. I added the flag --max_msa_seqs 32 and the structure was much closer to the template. A deep MSA can improve predictions of regions that are not in the template, though, so I guess that's where the trade-off is.
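To probe that trade-off, one could run the same input at several MSA depths and compare each result against the template. Below is a minimal sketch that only builds the command lines and does not run anything. The helper names `build_boltz_cmd` and `depth_sweep`, the file name `input.yaml`, and the output directory names are hypothetical; `--max_msa_seqs`, `--use_msa_server` and `--out_dir` are the flags quoted in this thread. It is worth checking `boltz predict --help` first, since the flag may not be available in every installed version.

```python
import shlex


def build_boltz_cmd(yaml_path, out_dir, max_msa_seqs=None, use_msa_server=True):
    """Return the argument list for one `boltz predict` run (nothing is executed)."""
    cmd = ["boltz", "predict", yaml_path, "--out_dir", out_dir]
    if use_msa_server:
        cmd.append("--use_msa_server")
    if max_msa_seqs is not None:
        # Flag as quoted in this thread; availability may depend on the version.
        cmd += ["--max_msa_seqs", str(max_msa_seqs)]
    return cmd


def depth_sweep(yaml_path, depths):
    """Build one command per MSA depth, each with its own output directory."""
    return [build_boltz_cmd(yaml_path, f"out_msa_{d}", max_msa_seqs=d) for d in depths]


if __name__ == "__main__":
    # Print the commands so they can be inspected before being run by hand.
    for cmd in depth_sweep("input.yaml", [32, 256, 4096]):
        print(shlex.join(cmd))
```

Writing each depth to a separate output directory keeps the predictions side by side, so they can be superimposed on the template afterwards.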
Thanks!
I would still love to hear from the Boltz team, though, about what they meant by "we allow the user to strictly enforce that templates are respected via a Boltz-steering potential".
Perhaps we don't have to sacrifice MSA depth to enforce the template.
I am unable to find any example of the correct way to specify a template in the YAML file. Would someone please be kind enough to post an example here?
The documentation at this link gives the syntax for specifying templates:
https://github.com/jwohlwend/boltz/blob/main/docs/prediction.md
version: 1
sequences:
  - protein:
      id: [A]
      sequence: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
templates:
  - cif: template path (e.g. template.cif or templates/temp.cif)
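Since YAML is indentation-sensitive, it can help to generate the input file rather than type it by hand. The following is a minimal sketch that reproduces only the keys shown in the example above (`version`, `sequences`, `protein`, `id`, `sequence`, `templates`, `cif`). The function name `make_template_yaml`, the toy sequence and the file names are hypothetical placeholders, not part of Boltz.

```python
from pathlib import Path


def make_template_yaml(chain_id, sequence, template_cif):
    """Return Boltz-style YAML text for one protein chain plus one CIF template."""
    lines = [
        "version: 1",
        "sequences:",
        "  - protein:",
        f"      id: [{chain_id}]",
        f"      sequence: {sequence}",
        "templates:",
        f"  - cif: {template_cif}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder sequence and paths; replace with your own.
    text = make_template_yaml("A", "MKTAYIAKQR", "templates/temp.cif")
    Path("input.yaml").write_text(text)
    print(text)
```

Note that `templates` sits at the top level of the file, next to `sequences`, not inside the `protein` entry.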
===
I can also confirm that simple inclusion of a template does not seem to do much in my case unless MSA depth is sacrificed.
Thanks to @parksjm and @ruslan-uoc for pointing me to the examples.
One thing I have found is that template steering works as long as I use exactly the same file that I downloaded from RCSB PDB. But if I tweak the file in PyMOL (say, by retaining only chain A and removing the other chains) and then export it from PyMOL to a CIF file, the custom file results in the following error:
File "/home/username/projects/boltz/src/boltz/data/parse/mmcif.py", line 869, in parse_mmcif
for chain in structure[0]:
Am I missing a trick here? Is there a specific way in which I should save the file in PyMOL after modifying it?
Is there more info in your error message? I know that for a custom template to work, your CIF file needs to contain (beyond the _atom_site loop) the _entity loop as well as _entity_poly, _entity_poly_seq and _struct_asym. Perhaps one of these is missing?
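A quick way to test for this is to scan the exported file for those category names before handing it to Boltz. The sketch below uses plain text matching with the standard library only. The list of required categories is taken from the comment above and is an assumption about what the parser needs, not something verified against the Boltz source; the function names are hypothetical.

```python
import re
import sys

# Categories named in this thread; an assumption, not a verified Boltz requirement.
REQUIRED = ("_atom_site", "_entity", "_entity_poly", "_entity_poly_seq", "_struct_asym")


def categories_in_cif(text):
    """Collect mmCIF category names, i.e. the part before the dot in `_category.item`."""
    found = set()
    for line in text.splitlines():
        # Tag lines start with an underscore; data rows inside loops do not.
        match = re.match(r"\s*(_[A-Za-z0-9_\[\]]+)\.", line)
        if match:
            found.add(match.group(1).lower())
    return found


def missing_categories(text, required=REQUIRED):
    """Return the required categories that do not appear in the CIF text."""
    found = categories_in_cif(text)
    return [name for name in required if name not in found]


if __name__ == "__main__":
    with open(sys.argv[1]) as handle:
        missing = missing_categories(handle.read())
    print("missing categories:", ", ".join(missing) if missing else "none")
```

This is a simple line-based check, so it only reports whether a category is present at all, not whether its contents are consistent with the atom records.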
With regard to template steering: according to the Boltz team, these potentials will apparently be pushed to the repo soon.
@pjmbro here is the full error:
$ boltz predict ./242291908-3.yaml --out_dir ./testrun5/ --cache ~/applications/boltz/model/ --use_msa_server
Checking input data.
Processing 1 inputs with 1 threads.
  0%|          | 0/1 [00:00<?, ?it/s]Traceback (most recent call last):
  File "/home/username/projects/repos/boltz/src/boltz/main.py", line 499, in process_input
    target = parse_yaml(path, ccd, mol_dir, boltz2)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/username/projects/repos/boltz/src/boltz/data/parse/yaml.py", line 68, in parse_yaml
    return parse_boltz_schema(name, data, ccd, mol_dir, boltz2)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/username/projects/repos/boltz/src/boltz/data/parse/schema.py", line 1619, in parse_boltz_schema
    parsed_template = parse_mmcif(
                      ^^^^^^^^^^^^
  File "/home/username/projects/repos/boltz/src/boltz/data/parse/mmcif.py", line 869, in parse_mmcif
    for chain in structure[0]:
                 ~~~~~~~~~^^^
IndexError
Failed to process 242291908-3.yaml. Skipping. Error: .
And you are correct. In the modified file, the _entity, _entity_poly, _entity_poly_seq, etc. loops are all missing. I need to look into getting PyMOL to export to CIF while keeping those fields intact.
ChimeraX writes mmCIF format including many metadata tables. It may produce an mmCIF file that Boltz can use.
Sorry for just dropping in, but I had the same problem with CIF files from different sources, and got the same error at 'for chain in structure[0]' when the file was parsed. I tried many ways to solve it. A friend told me he used a template CIF taken from RCSB, and any change to the file aborted the process. I even considered using something like https://sw-tools.rcsb.org/ and asked my colleagues to send me a MOE output. In the end, ChimeraX did produce a suitable mmCIF file. Thank you for the advice.
@tomgoddard, @rayevsky1985, that is very helpful information indeed. Thank you very much.
Unfortunately, I couldn't find the max_msa_seqs flag, or a way to assign a value to the --use_msa_server key. So I couldn't actually pin the conformation to the template (which was critical for me); however, the mechanism of template preparation was uncovered :)
Not all packages that convert .pdb files to .cif files correctly parse modern PDBx-style subchains or correctly identify the corresponding entity polymer type, which seems to be causing the incompatibility issues. We just pushed direct support for .pdb file templates, so manual conversion isn't necessary, and hopefully this resolves the issue. Please let us know if this works for your setup!
I found this script in the Boltz Slack community. It works for me for converting .pdb files into .cif.
Usage:
python pdb2cifgemmi.py [input.pdb] -o [output_dir]