ash icon indicating copy to clipboard operation
ash copied to clipboard

Bug & issue report

Open Zuttergutao opened this issue 1 year ago • 17 comments

Hi RagnarB83,

I’ve been working on learning ASH recently, and I’ve encountered several issues while going through the ASH tutorial. I’d like to report them here:

Problems:

  1. The solvate_small_molecule function doesn’t seem to work. I’m getting a TypeError: solvate_small_molecule received an unexpected keyword argument “nonbonded_pars.”
  2. While running a QM/MM geometry optimization for a protein’s QM region (around 100 atoms), the process struggles to meet the convergence criteria and tends to get stuck in local minima. This happens even when I set the number of optimization steps to 500. The converge results were 1e-2/1e-2 (rms/max) for the gradient and 1e-3/1e-3 (rms/max) for the displace.

Zuttergutao avatar Oct 18 '24 00:10 Zuttergutao

  1. When I run the Surface_scan job using the QMMMTheory method, I encounter an AttributeError: 'QMMMTheory' object has no attribute 'Filename'.

Zuttergutao avatar Oct 18 '24 00:10 Zuttergutao

Hi, thanks for reporting. I appreciate someone going through the tutorials and reporting back, this the only way of making sure they remain up-to-date.

  1. Solvate_small_molecule function: Unfortunately, the behaviour of this function changed, and the Metadynamics tutorial had not been properly updated. Previously, the function both creates the small-molecule FF and solvated, now it only solvates and requires the FF to have been created by another function. This is now documented on the page, let me know if it's clear.

  2. So, it is often the case for newly set-up protein systems that the first geometry optimization is quite tough to converge since all the atom positions are quite bad (coming either from an X-ray structure or from an MM simulation), so sometimes a few hundred optimization cycles are required. One can speed things up for this first optimization by choosing a cheap QM-level of theory and then switch to the more accurate QM-theory later. Later optimizations will be easier to converge. This typically applies when you have a large active region (not QM-region) of e.g. a 1000 atoms. It's also possible that to try another internal-coordinate system, the default is HDLC. If you send me the outputfile I can also check if something abnormal is going on.

  3. Probably a bug, let me take a look.

RagnarB83 avatar Oct 21 '24 11:10 RagnarB83

  1. I fixed the bug that gave an Attribute error for scans for QMMMTheory objects.

The fix is available on the new branch. Will be on the main branch in a week or so after a merge.

RagnarB83 avatar Oct 21 '24 11:10 RagnarB83

Hi, thanks for reporting. I appreciate someone going through the tutorials and reporting back, this the only way of making sure they remain up-to-date.

  1. Solvate_small_molecule function: Unfortunately, the behaviour of this function changed, and the Metadynamics tutorial had not been properly updated. Previously, the function both creates the small-molecule FF and solvated, now it only solvates and requires the FF to have been created by another function. This is now documented on the page, let me know if it's clear.
  2. So, it is often the case for newly set-up protein systems that the first geometry optimization is quite tough to converge since all the atom positions are quite bad (coming either from an X-ray structure or from an MM simulation), so sometimes a few hundred optimization cycles are required. One can speed things up for this first optimization by choosing a cheap QM-level of theory and then switch to the more accurate QM-theory later. Later optimizations will be easier to converge. This typically applies when you have a large active region (not QM-region) of e.g. a 1000 atoms. It's also possible that to try another internal-coordinate system, the default is HDLC. If you send me the outputfile I can also check if something abnormal is going on.
  3. Probably a bug, let me take a look.

Thanks for replying! However, i still encounter some problems.

  1. Regarding problem 1.

    • The option 1 did not work.the error is `you must have at least OpenMM 6.3 Installed". But I have installed OpenMM 8.1.2.
    • The option 2 works. But the mtd script gave the valueError ValueError: Found multiple NonbondedForce tags with different 1-4 scales Herei is the generated forcefield files. I think the problem is the difference of coulomb14scale between the LIG.xml you provided (0.83333) and I generated (1.0) ff.zip
  2. Regarding Problem 2. The following is the file I used. I am a beginner in QM/MM and hope you can give me some advice and guidance. When I set the convergence limit to superloose, it converged quickly. input.zip

  3. Regarding Problem 3. It works!

Zuttergutao avatar Oct 21 '24 16:10 Zuttergutao

  1. I fixed the bug that gave an Attribute error for scans for QMMMTheory objects.

The fix is available on the new branch. Will be on the main branch in a week or so after a merge.

Interestingly, it was able to do the calculations. But there were some problems with the output (I was using QM/MM two CV scans).

 File "/home/casea/CASEADATA/Research Project/Pathway/BmFaldDH/Amber/ASH/PESScan/orca_pes.py", line 60, in <module>
   results = calc_surface(fragment=frag, theory=qmmmobject, scantype='UnRelaxed', resultfile='surface_results.txt', runmode='serial', 
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 File "/home/casea/mambaforge/envs/ASH/lib/python3.11/site-packages/ash/modules/module_surface.py", line 507, in calc_surface
   result.write_to_disk(filename="ASH_surface.result")
 File "/home/casea/mambaforge/envs/ASH/lib/python3.11/site-packages/ash/modules/module_results.py", line 122, in write_to_disk
   f.write(json.dumps(newdict, allow_nan=True))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 File "/home/casea/mambaforge/envs/ASH/lib/python3.11/json/__init__.py", line 231, in dumps
   return _default_encoder.encode(obj)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 File "/home/casea/mambaforge/envs/ASH/lib/python3.11/json/encoder.py", line 200, in encode
   chunks = self.iterencode(o, _one_shot=True)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 File "/home/casea/mambaforge/envs/ASH/lib/python3.11/json/encoder.py", line 258, in iterencode
   return _iterencode(o, 0)
          ^^^^^^^^^^^^^^^^^
TypeError: keys must be str, int, float, bool or None, not tuple  File "/home/casea/CASEADATA/Research Project/Pathway/BmFaldDH/Amber/ASH/PESScan/orca_pes.py", line 60, in <module>
   results = calc_surface(fragment=frag, theory=qmmmobject, scantype='UnRelaxed', resultfile='surface_results.txt', runmode='serial', 
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 File "/home/casea/mambaforge/envs/ASH/lib/python3.11/site-packages/ash/modules/module_surface.py", line 507, in calc_surface
   result.write_to_disk(filename="ASH_surface.result")
 File "/home/casea/mambaforge/envs/ASH/lib/python3.11/site-packages/ash/modules/module_results.py", line 122, in write_to_disk
   f.write(json.dumps(newdict, allow_nan=True))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 File "/home/casea/mambaforge/envs/ASH/lib/python3.11/json/__init__.py", line 231, in dumps
   return _default_encoder.encode(obj)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 File "/home/casea/mambaforge/envs/ASH/lib/python3.11/json/encoder.py", line 200, in encode
   chunks = self.iterencode(o, _one_shot=True)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 File "/home/casea/mambaforge/envs/ASH/lib/python3.11/json/encoder.py", line 258, in iterencode
   return _iterencode(o, 0)
          ^^^^^^^^^^^^^^^^^
TypeError: keys must be str, int, float, bool or None, not tuple

In the meantime, is it possible to write a function so that each cv energy calculated can be output when it is updated, rather than outputting all together at the end?

Zuttergutao avatar Oct 21 '24 18:10 Zuttergutao

  • The option 1 did not work.the error is `you must have at least OpenMM 6.3 Installed". But I have installed OpenMM 8.1.2.

OK interesting, so this is not an ASH error message but a flawed error message coming from the parmed library dependency. Do you mind posting the full error message (if something more) and the version of the parmed library that was installed (and whether you installed it using pip or conda/mamba).

  • The option 2 works. But the mtd script gave the valueError ValueError: Found multiple NonbondedForce tags with different 1-4 scales Here is the generated forcefield files. I think the problem is the difference of coulomb14scale between the LIG.xml you provided (0.83333) and I generated (1.0) ff.zip

So this is a bit of an annoying thing that happens due to the difference between CHARMM-style and Amber-style forcefields and the way OpenMM handles this, and then ASH-generated XML-files vs. pairing them with a solvent forcefield like TIP3P that can also exist in CHARMM and Amber forms. This is mentioned in the ASH documentation here: https://ash.readthedocs.io/en/latest/Explicit-solvation.html#issues

I will have to update the tutorials once I'm done travelling to make sure this problem is avoided. Pairing the solute FF XML-file with e.g. the Amber-style TIP3P forcefield: "amber14/tip3p.xml" should work to avoid this problem right now.

  1. Regarding Problem 2. The following is the file I used. I am a beginner in QM/MM and hope you can give me some advice and guidance. When I set the convergence limit to superloose, it converged quickly. input.zip

OK, so first a couple of things. Visualizing your system coordinates I see that the system was created in Amber and using a hexagonal unit cell system. Also the protein sticks out of the water box when the system is interpreted non-periodically. For QM/MM in ASH, the QM-part will always have to be done non-periodically so it is going to be necessary to first wrap the coordinates so that the protein is in the center of the box in order to do a sensible calculation. It is usually also better to have simpler cubic boxes for this purpose. If this is not an option

I would not at all use the superloose convergence setting, this is a dummy-setting just for basic testing.

For the very first QM/MM geometry optimization of a new system with a sizeable active region you should be prepared that many geometry optimization steps will initially be required. This is because your active region contains quite a lot of atoms (466 in your case), meaning 1398 degrees of freedom (quite a bit) and all of the atoms have large forces acting on them in the very beginning (since the system is an X-ray structure or from an MM simulation), this is simply a tough minimization problem that requires a lot of iterations. So it may actually be perfectly normal for the geometry optimization not too converge using the default max-iterations (or even 500). Without seeing the outputfile I can't really tell whether things were progressing well or not. Note that even though it sometimes seems that the system is never getting close to convergence (i.e. the gradient keep oscillating) but the energy is still getting lower and lower, then this still indicates that the optimization is progressing and will probably converge. Only if the energy is completely stuck (i.e. not changing or zig-zagging), would I say that something is really wrong. I would also visualize the geometry optimization trajectory to make sure things look like they are behaving, i.e. atoms are moving in a sensible way.

So for many new protein setups you may have to set maxiter to a really large number (1000 or more) or restart the optimization a couple of times (use the last set of coordinates in each restart). Because this can actually be quite expensive to do at the QM/MM level, it is good to make the QM-part really cheap in the beginning while you are in this phase. For example: use a small-basis set (double-zeta), use a non-hybrid (e.g. BP86 instead of B3LYP), use a very small QM-region (just the bare minimum). Another option is to switch to a semi-empirical level of theory during this phase, and here your best bet is probably to use the xtb-program with either the GFN1-xTB or GFN2-XTB method since you have Zn. Once you have been able to complete this initial optimization of the new system, things will be much easier. You can switch to your desired theory-level again and perhaps a larger QM-region and you are likely to reconverge in less than 100 iterations. When you change the geometry of the QM-region, e.g. exploring another isomer or conformer, you are also likely to converge in less than 100 iterations. It is just in this first stage where things will really take a long time, or if you increase the active region.

RagnarB83 avatar Oct 21 '24 21:10 RagnarB83

  1. I fixed the bug that gave an Attribute error for scans for QMMMTheory objects.

The fix is available on the new branch. Will be on the main branch in a week or so after a merge.

Interestingly, it was able to do the calculations. But there were some problems with the output (I was using QM/MM two CV scans).

 File "/home/casea/CASEADATA/Research Project/Pathway/BmFaldDH/Amber/ASH/PESScan/orca_pes.py", line 60, in <module>
   results = calc_surface(fragment=frag, theory=qmmmobject, scantype='UnRelaxed', resultfile='surface_results.txt', runmode='serial', 
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 File "/home/casea/mambaforge/envs/ASH/lib/python3.11/site-packages/ash/modules/module_surface.py", line 507, in calc_surface
   result.write_to_disk(filename="ASH_surface.result")
 File "/home/casea/mambaforge/envs/ASH/lib/python3.11/site-packages/ash/modules/module_results.py", line 122, in write_to_disk
   f.write(json.dumps(newdict, allow_nan=True))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 File "/home/casea/mambaforge/envs/ASH/lib/python3.11/json/__init__.py", line 231, in dumps
   return _default_encoder.encode(obj)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 File "/home/casea/mambaforge/envs/ASH/lib/python3.11/json/encoder.py", line 200, in encode
   chunks = self.iterencode(o, _one_shot=True)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 File "/home/casea/mambaforge/envs/ASH/lib/python3.11/json/encoder.py", line 258, in iterencode
   return _iterencode(o, 0)
          ^^^^^^^^^^^^^^^^^
TypeError: keys must be str, int, float, bool or None, not tuple  File "/home/casea/CASEADATA/Research Project/Pathway/BmFaldDH/Amber/ASH/PESScan/orca_pes.py", line 60, in <module>
   results = calc_surface(fragment=frag, theory=qmmmobject, scantype='UnRelaxed', resultfile='surface_results.txt', runmode='serial', 
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 File "/home/casea/mambaforge/envs/ASH/lib/python3.11/site-packages/ash/modules/module_surface.py", line 507, in calc_surface
   result.write_to_disk(filename="ASH_surface.result")
 File "/home/casea/mambaforge/envs/ASH/lib/python3.11/site-packages/ash/modules/module_results.py", line 122, in write_to_disk
   f.write(json.dumps(newdict, allow_nan=True))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 File "/home/casea/mambaforge/envs/ASH/lib/python3.11/json/__init__.py", line 231, in dumps
   return _default_encoder.encode(obj)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 File "/home/casea/mambaforge/envs/ASH/lib/python3.11/json/encoder.py", line 200, in encode
   chunks = self.iterencode(o, _one_shot=True)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 File "/home/casea/mambaforge/envs/ASH/lib/python3.11/json/encoder.py", line 258, in iterencode
   return _iterencode(o, 0)
          ^^^^^^^^^^^^^^^^^
TypeError: keys must be str, int, float, bool or None, not tuple

In the meantime, is it possible to write a function so that each cv energy calculated can be output when it is updated, rather than outputting all together at the end?

This might actually be a bug, if you send me the inputscript and coordinate-files I will take a look at this and try to reproduce.

RagnarB83 avatar Oct 21 '24 21:10 RagnarB83

FYI: the tutorials have been updated so that the

ValueError: Found multiple NonbondedForce tags

problem is avoided.

RagnarB83 avatar Oct 24 '24 08:10 RagnarB83

  • The option 1 did not work.the error is `you must have at least OpenMM 6.3 Installed". But I have installed OpenMM 8.1.2.

OK interesting, so this is not an ASH error message but a flawed error message coming from the parmed library dependency. Do you mind posting the full error message (if something more) and the version of the parmed library that was installed (and whether you installed it using pip or conda/mamba).

  • The option 2 works. But the mtd script gave the valueError ValueError: Found multiple NonbondedForce tags with different 1-4 scales Here is the generated forcefield files. I think the problem is the difference of coulomb14scale between the LIG.xml you provided (0.83333) and I generated (1.0) ff.zip

So this is a bit of an annoying thing that happens due to the difference between CHARMM-style and Amber-style forcefields and the way OpenMM handles this, and then ASH-generated XML-files vs. pairing them with a solvent forcefield like TIP3P that can also exist in CHARMM and Amber forms. This is mentioned in the ASH documentation here: https://ash.readthedocs.io/en/latest/Explicit-solvation.html#issues

I will have to update the tutorials once I'm done travelling to make sure this problem is avoided. Pairing the solute FF XML-file with e.g. the Amber-style TIP3P forcefield: "amber14/tip3p.xml" should work to avoid this problem right now.

  1. Regarding Problem 2. The following is the file I used. I am a beginner in QM/MM and hope you can give me some advice and guidance. When I set the convergence limit to superloose, it converged quickly. input.zip

OK, so first a couple of things. Visualizing your system coordinates I see that the system was created in Amber and using a hexagonal unit cell system. Also the protein sticks out of the water box when the system is interpreted non-periodically. For QM/MM in ASH, the QM-part will always have to be done non-periodically so it is going to be necessary to first wrap the coordinates so that the protein is in the center of the box in order to do a sensible calculation. It is usually also better to have simpler cubic boxes for this purpose. If this is not an option

I would not at all use the superloose convergence setting, this is a dummy-setting just for basic testing.

For the very first QM/MM geometry optimization of a new system with a sizeable active region you should be prepared that many geometry optimization steps will initially be required. This is because your active region contains quite a lot of atoms (466 in your case), meaning 1398 degrees of freedom (quite a bit) and all of the atoms have large forces acting on them in the very beginning (since the system is an X-ray structure or from an MM simulation), this is simply a tough minimization problem that requires a lot of iterations. So it may actually be perfectly normal for the geometry optimization not too converge using the default max-iterations (or even 500). Without seeing the outputfile I can't really tell whether things were progressing well or not. Note that even though it sometimes seems that the system is never getting close to convergence (i.e. the gradient keep oscillating) but the energy is still getting lower and lower, then this still indicates that the optimization is progressing and will probably converge. Only if the energy is completely stuck (i.e. not changing or zig-zagging), would I say that something is really wrong. I would also visualize the geometry optimization trajectory to make sure things look like they are behaving, i.e. atoms are moving in a sensible way.

So for many new protein setups you may have to set maxiter to a really large number (1000 or more) or restart the optimization a couple of times (use the last set of coordinates in each restart). Because this can actually be quite expensive to do at the QM/MM level, it is good to make the QM-part really cheap in the beginning while you are in this phase. For example: use a small-basis set (double-zeta), use a non-hybrid (e.g. BP86 instead of B3LYP), use a very small QM-region (just the bare minimum). Another option is to switch to a semi-empirical level of theory during this phase, and here your best bet is probably to use the xtb-program with either the GFN1-xTB or GFN2-XTB method since you have Zn. Once you have been able to complete this initial optimization of the new system, things will be much easier. You can switch to your desired theory-level again and perhaps a larger QM-region and you are likely to reconverge in less than 100 iterations. When you change the geometry of the QM-region, e.g. exploring another isomer or conformer, you are also likely to converge in less than 100 iterations. It is just in this first stage where things will really take a long time, or if you increase the active region.

Hi,Ragnar! I used your suggestions to optimize, but even if I used a smaller qm region, the system did not converge well and the energy kept rising (btw, should I look at the overall energy or just the qm energy), I really don't know how to solve it, could you please help me check it out and give me some suggestions? Below are my input and output files. I used orca as the qm calculation tool. test.zip

Zuttergutao avatar Nov 18 '24 23:11 Zuttergutao

Yeah, there is something strange going on. The energy should not be going up like this. I will take a closer look.

RagnarB83 avatar Nov 19 '24 13:11 RagnarB83

Yeah, there is something strange going on. The energy should not be going up like this. I will take a closer look.

Thank you. I have also been looking for the reason. I guess it is related to the phosphate group in the active area, but i am not sure. I tried various optimizations and found that the energy of qm will gradually stabilize, but the energy of mm changes greatly, resulting in a very large gradient.

Looking forward to your good news.

Best wishes.

Zuttergutao avatar Nov 19 '24 14:11 Zuttergutao

Yes, so I suspect something is wrong with the MM contribution. If I run the optimization where only the current defined QM-region is allowed to move, like this:

Optimizer(theory=qmmmobject,
            fragment=frag,charge=charge,mult=mult,
            ActiveRegion=True, actatoms=qmatomlist,
            maxiter=500, coordsystem='hdlc')

then the system seems to behave normally (energy gets lower) while I get a rising energy if the actatoms list is larger. This suggests to me that one of the MM residues is responsible, as by freezing it in place the problem goes away.

Was it possible to run the system in Amber directly without any problems ? Were any constraints employed if so?

RagnarB83 avatar Nov 19 '24 14:11 RagnarB83

Yes, so I suspect something is wrong with the MM contribution. If I run the optimization where only the current defined QM-region is allowed to move, like this:

Optimizer(theory=qmmmobject,
            fragment=frag,charge=charge,mult=mult,
            ActiveRegion=True, actatoms=qmatomlist,
            maxiter=500, coordsystem='hdlc')

then the system seems to behave normally (energy gets lower) while I get a rising energy if the actatoms list is larger. This suggests to me that one of the MM residues is responsible, as by freezing it in place the problem goes away.

Was it possible to run the system in Amber directly without any problems ? Were any constraints employed if so?

My system was optimized using amber, and then a stable frame conformation was extracted. The QM program built into amber was used to minimize the qm region using the PM6 semi-empirical method. Everything was fine. I had encountered a situation in the previous md simulation (other project) where the phosphate group caused the system to collapse, so I guess it might be the two phosphate groups on the cofactor that caused the problem, but I'm not sure. The system has no non-standard residues, was built using tleap, the system was minimized, and the finial simulation was normal. Currently I can't find the real reason, could you help me solve it? Thanks.

Zuttergutao avatar Nov 19 '24 14:11 Zuttergutao

Hi! I want to use ASH for transition state search and transition state structure optimization. Here are some problems:

  1. Can I only use the NEB method for transition state search?
  2. Are the methods and keywords for transition state structure optimization and conventional geometry optimization the same?

Zuttergutao avatar Nov 30 '24 07:11 Zuttergutao

Hi, I'm still looking into your first issue but it is tricky and most likely a problem in the MM representation of a particular residue. A workaround for now is to not have an active MM region, i.e. let the QM-region and active region be identical.

Regarding your TS search question. NEB is a minimum energy path method as well as a saddlepoint-finder.

To find a saddlepoint precisely one can in principle use the TSOpt feature in the interface to the geometric library which activates a Quasi-Newton eigenvector-following algorithm. It requires a choice for Hessian and is documented here: https://ash.readthedocs.io/en/latest/job-types.html#saddle-point-optimization and https://ash.readthedocs.io/en/latest/Geometry-optimization.html

So basically you just have to do Optimizer(theory=theory, fragment=fragment, TSOpt=True) but you also have to be aware of that by default the exact Hessian of the full active region is calculated in the first step (can be expensive).

The issue with this approach is that it only works if you have a very good guess for the TS geometry already (this is rarely the case) and you have to either calculate an expensive Hessian in the beginning or alternatively read in an approximate Hessian file from another source.

NEB and NEBTS are methods that help you find the saddlepoint geometry when it is not easily estimated. https://ash.readthedocs.io/en/latest/neb.html

By minimizing a minimum energy path between a reactant and product geometry (2 local minima) the saddlepoint can be either directly calculated using the climbing-image NEB method (NEB-CI) or via a combination of NEB-CI+eigenvector-following-algorithm (called NEBTS). NEB-CI is quite reliable but is time-consuming to converge completely and has a small tendency to end up in a saddlepoint that has more than 1 imaginary mode. NEBTS works by only partially converging the NEB-CI job and then automatically switches to the eigenvector-following algorithm.

You can read about these methods in this paper: https://pubs.acs.org/doi/10.1021/acs.jctc.1c00462 The ASH version of NEB-CI is practically identical while the NEBTS version differs slightly regarding the Hessian used.

RagnarB83 avatar Dec 02 '24 14:12 RagnarB83

Hi, I'm still looking into your first issue but it is tricky and most likely a problem in the MM representation of a particular residue. A workaround for now is to not have an active MM region, i.e. let the QM-region and active region be identical.

Regarding your TS search question. NEB is a minimum energy path method as well as a saddlepoint-finder.

To find a saddlepoint precisely one can in principle use the TSOpt feature in the interface to the geometric library which activates a Quasi-Newton eigenvector-following algorithm. It requires a choice for Hessian and is documented here: https://ash.readthedocs.io/en/latest/job-types.html#saddle-point-optimization and https://ash.readthedocs.io/en/latest/Geometry-optimization.html

So basically you just have to do Optimizer(theory=theory, fragment=fragment, TSOpt=True) but you also have to be aware of that by default the exact Hessian of the full active region is calculated in the first step (can be expensive).

The issue with this approach is that it only works if you have a very good guess for the TS geometry already (this is rarely the case) and you have to either calculate an expensive Hessian in the beginning or alternatively read in an approximate Hessian file from another source.

NEB and NEBTS are methods that help you find the saddlepoint geometry when it is not easily estimated. https://ash.readthedocs.io/en/latest/neb.html

By minimizing a minimum energy path between a reactant and product geometry (2 local minima) the saddlepoint can be either directly calculated using the climbing-image NEB method (NEB-CI) or via a combination of NEB-CI+eigenvector-following-algorithm (called NEBTS). NEB-CI is quite reliable but is time-consuming to converge completely and has a small tendency to end up in a saddlepoint that has more than 1 imaginary mode. NEBTS works by only partially converging the NEB-CI job and then automatically switches to the eigenvector-following algorithm.

You can read about these methods in this paper: https://pubs.acs.org/doi/10.1021/acs.jctc.1c00462 The ASH version of NEB-CI is practically identical while the NEBTS version differs slightly regarding the Hessian used.

I truly appreciate the time and effort you've dedicated to resolving these issues for me. In fact, I only use the qm region as the active region currently and no error was given. In my experience, ASH is considerably more user-friendly compared to ChemShell. I would like to express my gratitude once more for creating such a valuable tool.

Zuttergutao avatar Dec 02 '24 23:12 Zuttergutao

Thanks, glad to hear it

RagnarB83 avatar Dec 03 '24 07:12 RagnarB83