Template Issue and hacky fix

Open cooperstergisjamieson opened this issue 5 months ago • 4 comments

I was getting the following error when running with templates:

Traceback (most recent call last):
  File "/home/user/bin/boltz-v2/src/boltz/main.py", line 499, in process_input
    target = parse_yaml(path, ccd, mol_dir, boltz2)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/bin/boltz-v2/src/boltz/data/parse/yaml.py", line 68, in parse_yaml
    return parse_boltz_schema(name, data, ccd, mol_dir, boltz2)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/bin/boltz-v2/src/boltz/data/parse/schema.py", line 1599, in parse_boltz_schema
    parsed_template = parse_mmcif(
                      ^^^^^^^^^^^^
  File "/home/user/bin/boltz-v2/src/boltz/data/parse/mmcif.py", line 1201, in parse_mmcif
    chains = np.array(chain_data, dtype=Chain)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: could not assign tuple of length 10 to structure with 11 fields.

It seems like these lines must be modified to run with a template:

src/boltz/data/parse/mmcif.py

class ParsedChain:
    """A parsed chain object."""

    name: str
    entity: str
    type: int
    residues: list[ParsedResidue]
    sequence: Optional[str] = None
    affinity: bool = True # added

src/boltz/data/parse/mmcif.py

chain_data.append(
    (
        chain.name,
        chain.type,
        entity_id,
        sym_id,
        asym_id,
        atom_idx,
        atom_num,
        res_idx,
        res_num,
        0,
        chain.affinity, # added
    )
)

This seems to fix the preprocessing and allows running with a template. However, my little hack breaks the featurization for affinity predictions, specifically in compute_template_features.

Also, it may be good to remove the hardcoded 0 from the chain_data tuple.
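
For readers who want to see the first error in isolation: the ValueError above is NumPy's response when a tuple has fewer values than the structured dtype has fields. The snippet below is a minimal sketch that reproduces this class of failure. The three-field `Chain` dtype and the `build_chains` helper are hypothetical stand-ins, not the real Boltz definitions, which have 11 fields.

```python
import numpy as np

# Hypothetical three-field record standing in for the larger Boltz Chain dtype.
Chain = np.dtype([("name", "U4"), ("mol_type", "i1"), ("affinity", "?")])


def build_chains(rows):
    """Convert a list of tuples into a structured array (illustrative helper)."""
    return np.array(rows, dtype=Chain)


# One value short: NumPy cannot fill a 3-field record from a 2-tuple.
try:
    build_chains([("A", 0)])
    error_message = None
except ValueError as exc:
    error_message = str(exc)

# Supplying one value per field resolves the mismatch.
chains = build_chains([("A", 0, False)])
print(error_message)
print(chains)
```

Comparing `len(row)` with `len(Chain.names)` before building the array is a cheap way to catch this kind of drift between the tuple and the dtype.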

cooperstergisjamieson avatar Jun 06 '25 23:06 cooperstergisjamieson

I believe this should be fixed in 2.0.3. Do make sure to delete your current output folder, and let me know if the problem persists.

jwohlwend avatar Jun 07 '25 05:06 jwohlwend

Hi @jwohlwend, thanks for the quick response. The update now works for template constraints, but fails for affinity predictions when run with a template constraint. The issue still arises in compute_template_features.

Traceback (most recent call last):
  File "/home/user/bin/boltz/src/boltz/data/module/inferencev2.py", line 274, in __getitem__
    features = self.featurizer.process(
               ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/bin/boltz/src/boltz/data/feature/featurizerv2.py", line 2175, in process
    template_features = process_template_features(
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/bin/boltz/src/boltz/data/feature/featurizerv2.py", line 1797, in process_template_features
    row_features = compute_template_features(data, row_tokens, max_tokens)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/bin/boltz/src/boltz/data/feature/featurizerv2.py", line 1693, in compute_template_features
    query_token = query_tokens.tokens[idx]
                  ~~~~~~~~~~~~~~~~~~~^^^^^
IndexError: index 293 is out of bounds for axis 0 with size 240
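
The thread does not establish the root cause, but the message itself is easy to read: a token index (293) is being looked up in an array that holds only 240 tokens. The sketch below reproduces that message and shows a quick diagnostic for spotting out-of-range indices. It assumes, purely for illustration, that the indices were computed against a longer token array than the one being indexed; `tokens` and `template_indices` are hypothetical names, not Boltz internals.

```python
import numpy as np

# Sizes taken from the traceback: 240 tokens available, index 293 requested.
tokens = np.arange(240)

# Assumed for illustration: indices computed against a longer token array.
template_indices = np.array([10, 120, 293])

try:
    tokens[293]
    message = None
except IndexError as exc:
    message = str(exc)

# Diagnostic only: separate indices that fit from those that do not.
in_range = template_indices[template_indices < len(tokens)]
out_of_range = template_indices[template_indices >= len(tokens)]
print(message)
print(in_range, out_of_range)
```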

cooperstergisjamieson avatar Jun 09 '25 16:06 cooperstergisjamieson

Just letting you know that I've been running into this error as well. It appears to be the same as issue #323.

noahharrison64 avatar Jun 10 '25 13:06 noahharrison64

Same issue for me. My run works fine without specifying a template.


Traceback (most recent call last):
  File "/home/andre/environments/venv_boltz2/lib/python3.11/site-packages/boltz/data/module/inferencev2.py", line 274, in __getitem__
    features = self.featurizer.process(
               ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/andre/environments/venv_boltz2/lib/python3.11/site-packages/boltz/data/feature/featurizerv2.py", line 2175, in process
    template_features = process_template_features(
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/andre/environments/venv_boltz2/lib/python3.11/site-packages/boltz/data/feature/featurizerv2.py", line 1797, in process_template_features
    row_features = compute_template_features(data, row_tokens, max_tokens)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/andre/environments/venv_boltz2/lib/python3.11/site-packages/boltz/data/feature/featurizerv2.py", line 1693, in compute_template_features
    query_token = query_tokens.tokens[idx]
                  ~~~~~~~~~~~~~~~~~~~^^^^^
IndexError: index 209 is out of bounds for axis 0 with size 209

fisand08 avatar Jun 16 '25 12:06 fisand08

I have a similar issue when I use a template together with the affinity calculation. I get a good prediction but no affinity calculation.

Traceback (most recent call last):
  File "/home/cryoem/.conda/envs/boltz2/lib/python3.11/site-packages/boltz/data/module/inferencev2.py", line 274, in __getitem__
    features = self.featurizer.process(
               ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/cryoem/.conda/envs/boltz2/lib/python3.11/site-packages/boltz/data/feature/featurizerv2.py", line 2175, in process
    template_features = process_template_features(
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/cryoem/.conda/envs/boltz2/lib/python3.11/site-packages/boltz/data/feature/featurizerv2.py", line 1797, in process_template_features
    row_features = compute_template_features(data, row_tokens, max_tokens)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/cryoem/.conda/envs/boltz2/lib/python3.11/site-packages/boltz/data/feature/featurizerv2.py", line 1693, in compute_template_features
    query_token = query_tokens.tokens[idx]
IndexError: index 234 is out of bounds for axis 0 with size 225

Riccardo97-13 avatar Jun 19 '25 13:06 Riccardo97-13

I hit the same thing today. The template really improves the model, but then the affinity prediction fails.

seanrjohnson avatar Jun 19 '25 17:06 seanrjohnson

I had the same issue, but got around it by changing featurizerv2.py line 2174 to include a simple check:

        if not compute_affinity and data.templates:
            template_features = process_template_features(....

I don't see why the template would be needed for the affinity prediction, and this function gets called twice, once with compute_affinity=False, and then with compute_affinity=True.
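
The control flow of this workaround can be sketched in a few lines. Everything here is a hypothetical stand-in (the `Data` class, the toy `process_template_features`, and the returned dictionary), meant only to show how the guard skips template featurization on the affinity pass. What the real featurizer does in the other branch is not shown in this thread.

```python
from dataclasses import dataclass, field


@dataclass
class Data:
    """Hypothetical stand-in for the featurizer input."""
    templates: list = field(default_factory=list)


def process_template_features(data):
    """Toy placeholder; the real function builds template tensors."""
    return {"template_count": len(data.templates)}


def process(data, compute_affinity=False):
    """Mimic a featurizer that runs once per pass (structure, then affinity)."""
    features = {"compute_affinity": compute_affinity}
    # The guard from the comment above: only the structure pass uses templates.
    if not compute_affinity and data.templates:
        features.update(process_template_features(data))
    return features


data = Data(templates=["chain_D.cif"])
structure_features = process(data, compute_affinity=False)
affinity_features = process(data, compute_affinity=True)
print(structure_features)
print(affinity_features)
```

With the guard in place, the second call never reaches the template code path, which is where the IndexError in the tracebacks above is raised.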

leroybird avatar Jun 23 '25 05:06 leroybird

Dear leroybird, thank you. I tried your suggestion and now everything works fine.

Riccardo97-13 avatar Jun 23 '25 12:06 Riccardo97-13

When I use the provided template for structure prediction only, it works correctly and returns the expected results.

However, when I use the same setup to predict affinity, the program fails with an error.

sequences:
  - protein:
      id: [D]
      sequence: LQPVIGIISIDNYDDTIESLADADVSQINGFIANFISEFAQSREIFYRRVNMDRFYFFTDYSVLDQLIQDKFEVLEQFRKEAQERHLPLTLSMGISYGNANHSQIGQIALKNLNIALVRGGDQAVIRENDEHKKLLYFGGGTVSTIKRSRTRTRAMMTAISDKIKTVDSVFVVGHKNLDMDALGASVGMQAFANNIIEHAAYAVYDEDSMSHDVARAVNRLKEDGHTQLLTVKESIEQVSDNSLLVMVDHSKLQLTLSRELYNKFTEVIVIDHHRRDDDFPENAILTFIESGASSASELVTELLQFQNGKYHLNKIQASIVMAGIMLDTKSFSTRVTSRTFDVASYLRTLGSDNVEIQNISALDFDEYRLINELILRGDRILPNVVVATGADDISYSNVIASKAADTMLNMAGIEATFVITRNDERTVCISARSRNKINVQRIMEEMGGGGHFNLAACQLKGTSVKEARKLLLEKIKEE
  - ligand:
      id: [C]
      smiles: 'Nc1ncnc2c1ncn2C1OC2COP(=O)([O-])OC3C(COP(=O)([O-])OC2C1O)OC(n1cnc2c(N)ncnc21)C3O'
templates:
  - cif: chain_D.cif
    chain_id: D
properties:
  - affinity:
      binder: C

template link: https://drive.google.com/file/d/1gdAFvoJEQWc7axftYZi2Vhiq440f2inq/view?usp=sharing

Error information:

Traceback (most recent call last):
  File "/ext3/miniconda3/envs/pyg/lib/python3.11/site-packages/boltz/data/module/inferencev2.py", line 274, in __getitem__
    features = self.featurizer.process(
               ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/ext3/miniconda3/envs/pyg/lib/python3.11/site-packages/boltz/data/feature/featurizerv2.py", line 2175, in process
    template_features = process_template_features(
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/ext3/miniconda3/envs/pyg/lib/python3.11/site-packages/boltz/data/feature/featurizerv2.py", line 1797, in process_template_features
    row_features = compute_template_features(data, row_tokens, max_tokens)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/ext3/miniconda3/envs/pyg/lib/python3.11/site-packages/boltz/data/feature/featurizerv2.py", line 1693, in compute_template_features
    query_token = query_tokens.tokens[idx]
                  ~~~~~~~~~~~~~~~~~~~^^^^^
IndexError: index 242 is out of bounds for axis 0 with size 238

xiaolinpan avatar Jul 12 '25 14:07 xiaolinpan