QUBEKit
MissingRfreeError: The following elements have no reference Rfree values
Hi,
with the following molecule:
Oc1ccc(cc1)CC[C@@]1(OC(=O)C(=C(O)C1)Sc1cc(c(N)cc1C(C)(C)C)C)C1CCCC1
I get the error message
qubekit.utils.exceptions.MissingRfreeError: The following elements have no reference Rfree values which are required to parameterise the molecule {'s'}
which I assume is because of the S in the molecule? Is this hopeless, or do you have any pointers on how I could proceed to get to a parameterisation?
Best regards, Ben
Hi @entropybit,
Not all of the parameterisation protocols we published contain parameters for S and the halogens, only 5b and 5d do; you can learn more about the differences between these in the paper. You can select a different protocol than the default by adding a flag like -p 5b to the run command.
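For example, something like the following (the molecule name and SMILES here are just placeholders):
qubekit run -sm "CCO" -n ethanol -p 5b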
Hope this helps!
Okay, after the changes in the other issue I was hopeful that this could work, so I tried the following:
qubekit run -sm "Oc1ccc(cc1)CC[C@@]1(OC(=O)C(=C(O)C1)Sc1cc(c(N)cc1C(C)(C)C)C)C1CCCC1" -n 1OS5 --cores 96 --memory 200 -p 5b
which in the end produced this error:
Warning (not error because allow_undefined_stereo=True): Unable to make OFFMol from RDMol: RDMol has unspecified stereochemistry. Undefined chiral centers are:
- Atom C (index 7)
Stereochemistry for atom 35 flipped from S to None
If QUBEKit ever breaks or you would like to view timings and loads of other info, view the log file.
Our documentation (README.md) also contains help on handling the various commands for QUBEKit.
Fragmenting molecule using the WBOFragmenter
Traceback (most recent call last):
File "/work/scratch/b_mayer/miniconda3/envs/qubekit/lib/python3.10/site-packages/qubekit/workflow/workflow.py", line 348, in _run_stage
result_mol = stage.run(
File "/work/scratch/b_mayer/miniconda3/envs/qubekit/lib/python3.10/site-packages/qubekit/fragmentation/wbo_fragmenter.py", line 71, in run
return self._run(molecule, *args, **kwargs)
File "/work/scratch/b_mayer/miniconda3/envs/qubekit/lib/python3.10/site-packages/qubekit/fragmentation/wbo_fragmenter.py", line 83, in _run
depict_fragmentation_result(
File "/work/scratch/b_mayer/miniconda3/envs/qubekit/lib/python3.10/site-packages/openff/fragmenter/depiction.py", line 307, in depict_fragmentation_result
depict_fragments(
File "/work/scratch/b_mayer/miniconda3/envs/qubekit/lib/python3.10/site-packages/openff/fragmenter/depiction.py", line 332, in depict_fragments
header_svg = _oe_render_parent(parent, [*fragments])
File "/work/scratch/b_mayer/miniconda3/envs/qubekit/lib/python3.10/site-packages/openff/fragmenter/depiction.py", line 92, in _oe_render_parent
oe_parent = parent.to_openeye()
File "/work/scratch/b_mayer/miniconda3/envs/qubekit/lib/python3.10/site-packages/openff/toolkit/utils/base_wrapper.py", line 50, in wrapped_function
raise ToolkitUnavailableException(msg)
openff.toolkit.utils.exceptions.ToolkitUnavailableException: This function requires the OpenEye Toolkit toolkit
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/work/scratch/b_mayer/miniconda3/envs/qubekit/bin/qubekit", line 8, in <module>
sys.exit(cli())
File "/work/scratch/b_mayer/miniconda3/envs/qubekit/lib/python3.10/site-packages/click/core.py", line 1130, in __call__
return self.main(*args, **kwargs)
File "/work/scratch/b_mayer/miniconda3/envs/qubekit/lib/python3.10/site-packages/click/core.py", line 1055, in main
rv = self.invoke(ctx)
File "/work/scratch/b_mayer/miniconda3/envs/qubekit/lib/python3.10/site-packages/click/core.py", line 1657, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/work/scratch/b_mayer/miniconda3/envs/qubekit/lib/python3.10/site-packages/click/core.py", line 1404, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/work/scratch/b_mayer/miniconda3/envs/qubekit/lib/python3.10/site-packages/click/core.py", line 760, in invoke
return __callback(*args, **kwargs)
File "/work/scratch/b_mayer/miniconda3/envs/qubekit/lib/python3.10/site-packages/qubekit/cli/run.py", line 136, in run
workflow.new_workflow(molecule=molecule, skip_stages=skip_stages, end=end)
File "/work/scratch/b_mayer/miniconda3/envs/qubekit/lib/python3.10/site-packages/qubekit/workflow/workflow.py", line 272, in new_workflow
return self._run_workflow(
File "/work/scratch/b_mayer/miniconda3/envs/qubekit/lib/python3.10/site-packages/qubekit/workflow/workflow.py", line 310, in _run_workflow
molecule = self._run_stage(
File "/work/scratch/b_mayer/miniconda3/envs/qubekit/lib/python3.10/site-packages/qubekit/workflow/workflow.py", line 372, in _run_stage
raise WorkFlowExecutionError(
qubekit.utils.exceptions.WorkFlowExecutionError: The workflow stopped unexpectedly due to the following error at stage: fragmentation
I'll attach a zip of the results folder again: 1os5.zip
Guess I'll also try out 5d and see if the same happens now.
@jthorton
Hmm, no; same error with 5d. And now that I read the error messages above in some more detail, this seems to be because the OpenEye toolkit is not available. Since I installed it, I assume this has to do with a missing license or something. Did I install the wrong dependency or something? I don't recall reading about this in your documentation at all...
Edit: Okay, after just uninstalling openeye-toolkit it now seems to be working (still running, we will see if it doesn't manage to produce some other errors ^^). But I find it a bit worrisome that QUBEKit stops if this package is available but the license is missing (at least I assume that is what causes it). You should probably build in something that catches openff.toolkit.utils.exceptions.ToolkitUnavailableException and defaults back to RDKit.
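A minimal sketch of the kind of guard I mean, assuming the function and exception names from the traceback above (fragmentation_result is just a placeholder for whatever object the stage produces, and simply skipping the picture is only one possible fallback):

from openff.fragmenter.depiction import depict_fragmentation_result
from openff.toolkit.utils.exceptions import ToolkitUnavailableException

try:
    # normal path: depict the fragments, which currently requires OpenEye
    depict_fragmentation_result(result=fragmentation_result, output_file="fragments.html")
except ToolkitUnavailableException:
    # OpenEye is installed but unlicensed (or missing entirely);
    # skip the depiction instead of aborting the whole fragmentation stage
    print("OpenEye toolkit unavailable, skipping fragment depiction")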
Ok, @jthorton I got some new errors:
Optimising conformer: 0%| | 0/3 [04:24<?, ?it/s]
Molecule optimisation complete.
Calculating Hessian matrix.
Traceback (most recent call last):
File "/work/scratch/b_mayer/miniconda3/envs/qubekit/lib/python3.10/site-packages/qubekit/workflow/workflow.py", line 348, in _run_stage
result_mol = stage.run(
File "/work/scratch/b_mayer/miniconda3/envs/qubekit/lib/python3.10/site-packages/qubekit/utils/datastructures.py", line 217, in run
molecule = self._run(molecule, *args, **kwargs)
File "/work/scratch/b_mayer/miniconda3/envs/qubekit/lib/python3.10/site-packages/qubekit/workflow/helper_stages.py", line 63, in _run
raise HessianCalculationFailed(
qubekit.utils.exceptions.HessianCalculationFailed: The hessian calculation failed for 1OS5 please check the result json.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/work/scratch/b_mayer/miniconda3/envs/qubekit/bin/qubekit", line 8, in <module>
sys.exit(cli())
File "/work/scratch/b_mayer/miniconda3/envs/qubekit/lib/python3.10/site-packages/click/core.py", line 1130, in __call__
return self.main(*args, **kwargs)
File "/work/scratch/b_mayer/miniconda3/envs/qubekit/lib/python3.10/site-packages/click/core.py", line 1055, in main
rv = self.invoke(ctx)
File "/work/scratch/b_mayer/miniconda3/envs/qubekit/lib/python3.10/site-packages/click/core.py", line 1657, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/work/scratch/b_mayer/miniconda3/envs/qubekit/lib/python3.10/site-packages/click/core.py", line 1404, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/work/scratch/b_mayer/miniconda3/envs/qubekit/lib/python3.10/site-packages/click/core.py", line 760, in invoke
return __callback(*args, **kwargs)
File "/work/scratch/b_mayer/miniconda3/envs/qubekit/lib/python3.10/site-packages/qubekit/cli/run.py", line 136, in run
workflow.new_workflow(molecule=molecule, skip_stages=skip_stages, end=end)
File "/work/scratch/b_mayer/miniconda3/envs/qubekit/lib/python3.10/site-packages/qubekit/workflow/workflow.py", line 272, in new_workflow
return self._run_workflow(
File "/work/scratch/b_mayer/miniconda3/envs/qubekit/lib/python3.10/site-packages/qubekit/workflow/workflow.py", line 310, in _run_workflow
molecule = self._run_stage(
File "/work/scratch/b_mayer/miniconda3/envs/qubekit/lib/python3.10/site-packages/qubekit/workflow/workflow.py", line 372, in _run_stage
raise WorkFlowExecutionError(
qubekit.utils.exceptions.WorkFlowExecutionError: The workflow stopped unexpectedly due to the following error at stage: hessian
What I immediately notice is that there is no qm_optimization subfolder; the 5d/5b protocols should still have that, right? Anyway, I have added the entire folder as a zip again. 1os5_2.zip
Hi @jthorton, as a small extension/comment on that, I tried out other molecules too (without the -p 5d specification, just the normal protocol) and I basically always have this in the QUBEKit.err file:
Traceback (most recent call last):
File "/work/scratch/b_mayer/miniconda3/envs/qubekit/lib/python3.10/site-packages/qubekit/workflow/workflow.py", line 348, in _run_stage
result_mol = stage.run(
File "/work/scratch/b_mayer/miniconda3/envs/qubekit/lib/python3.10/site-packages/qubekit/utils/datastructures.py", line 217, in run
molecule = self._run(molecule, *args, **kwargs)
File "/work/scratch/b_mayer/miniconda3/envs/qubekit/lib/python3.10/site-packages/qubekit/workflow/helper_stages.py", line 63, in _run
raise HessianCalculationFailed(
qubekit.utils.exceptions.HessianCalculationFailed: The hessian calculation failed for NU7026 please check the result json.
I think this is the same error but if you think it helps I can also zip you one of those other result folders.
Hi @entropybit, thanks for the report. For the fragmentation issue it looks like an issue in openff-fragmenter, where it just checks that the package is installed when trying to depict the fragments with OpenEye; you are right that this shouldn't stop the workflow if a license is missing, so I'll try and get this fixed there.
For the hessian stage it seems to be an issue with the Gaussian calculation. In the zipped folder there is a gaussian.com file in the hessian folder, which is generated by QUBEKit to run the calculation, and a qcengine_result.json file which has the error output. I am not familiar with the error message in the file (see below), but a quick google seems to suggest it's a memory issue and should be fixed by providing more memory to the calculation.
End of G2Drv F.D. properties file 721 does not exist.
End of G2Drv F.D. properties file 722 does not exist.
End of G2Drv F.D. properties file 788 does not exist.
IDoAtm=11111111111111111111111111111111111111111111111111
IDoAtm=1111111111111111111111
Differentiating once with respect to electric field.
with respect to dipole field.
Differentiating once with respect to nuclear coordinates.
CalDSu exits because no D1Ps are significant.
Erroneous write. Write -1 instead of 74297640.
fd = 4\norig len = 124105728 left = 74297640
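If it helps, a quick way to pull that error text out of the result file is something like the sketch below; I am assuming the usual QCEngine failed-operation layout with an error block, so treat the key names as a guess:

import json

# read the result file QUBEKit leaves next to gaussian.com in the hessian folder
with open("qcengine_result.json") as f:
    result = json.load(f)

# a failed QCEngine calculation normally stores the captured program output here
print(result.get("error", {}).get("error_message", "no error block found"))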
Hey, I got the same error again after running it on our large memory node, where I used the following specification:
qubekit run -sm "Oc1ccc(cc1)CC[C@@]1(OC(=O)C(=C(O)C1)Sc1cc(c(N)cc1C(C)(C)C)C)C1CCCC1" -n 1OS5 --cores 96 --memory 1200 -p 5d
1.2 TB is apparently not enough, if you are correct that this is a memory issue. Can it be anything else? Before that I ran it with --memory 200, which is already not that little RAM (if this is indeed in GB).
Best regards, Ben
PS: As before, I have attached a zip of my output folder: 1OS5_2023_04_12.zip
PPS: If this turns out to really be something about how the tools/algorithms work... any chance that we can run this as a distributed computation?
Hey, I am not sure what the issue is with this then, as the error message from Gaussian is very cryptic! My only thought is that the optimisation works fine but the hessian fails, and the only extra thing we do in the hessian stage is calculate the bond order. Can you try removing the bond order commands in the Gaussian input file and running it manually to see if that works? (It would be strange if this were the issue, as you reported being able to run a molecule all the way through before, which should have included this step.) Maybe also try removing parts of the input file / changing the cores and memory settings until you find a working combination?
Lastly, I'll try and run this locally as well, although I only have access to Gaussian 09, but if it is a memory issue I think I should see the same thing.
Hey, I managed to run this exclusively on our 44-core node with 100 GB of memory and it finished in around an hour. I used the input file from your zip folder and only changed the cores and memory lines, so it might be something to do with the calculation setup. Did you have any luck running it manually?
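For reference, the lines I mean are the Link 0 header at the top of gaussian.com; the values below are only illustrative of what I used on our node, and the exact directive names in your file may differ slightly:
%Mem=100GB
%NProcShared=44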
Thanks for that. Just to let you know, I might take a bit longer trying it out this time, but by next week I should have managed to do it. If it works, I assume I can simply continue using --skip-stages or restart?
Okay, this is very weird. I did try it out with a batch job running the Gaussian file only. This was my SLURM batch job:
#!/bin/bash
#SBATCH -J g16
#SBATCH --mail-type=END
#SBATCH -n 96 # number of processes (= total cores to use, here: 2 nodes à 96 cores)
#SBATCH --mem-per-cpu=2500 # required main memory in MByte per MPI task/process
#SBATCH -t 24:00:00 # in hours, minutes and seconds, or '#SBATCH -t 10' - just minutes
#SBATCH -p project02019
module load gaussian/g16
g16 < gaussian.com > g16.log
and I get
Lmod: unloading gaussian g16
Lmod: loading gaussian g16
galloc: could not allocate memory.: Cannot allocate memory
Error: segmentation violation
rax 0000000000000000, rbx 0000000005ce8cd0, rcx 0000150143caf55b
rdx 0000000000000000, rsp 00007fff7541ff68, rbp 0000000006def2a0
rsi 000000000000000b, rdi 000000000021cf00, r8 000015014404e860
r9 0000150144ba9100, r10 0000000000000009, r11 0000000000000202
r12 00007fff75420000, r13 0000000022b84e90, r14 0000000005cc04b0
r15 0000000005cc04b0
/lib64/libpthread.so.0(+0x12cf0) [0x1501443e6cf0]
/lib64/libc.so.6(kill+0xb) [0x150143caf55b]
/shared/apps/gaussian/g16/l101.exe() [0x4af45a]
/shared/apps/gaussian/g16/l101.exe() [0x4ef52a]
I'll attach the gaussian.com file as well as the log: gaussian.zip
The nodes have 96 cores and ~250GB so this should be enough...
I think I found the problem: if I open the gaussian.com file, the first line says %Mem 1200GB, which of course none of our normal nodes can fulfil. How and from which parameters is the input number of cores and memory generated? Is this simply the values from --cores and --memory?
EDIT: Ok, the 1200GB came from me running it on the large-mem node, but it should have worked then; very weird indeed...
Still don't understand why the Gaussian run didn't work before (maybe I really did something wrong by specifying too much memory, but I don't think so). However, when restarting I get errors again ^^'
Calculating charges using chargemol and ddec6.
Charges calculated and AIM reference data stored.
Fitting virtual site positions and charges.
Virtual sites optimised and saved.
Calculating Lennard-Jones parameters for a 12-6 potential.
Lennard-Jones 12-6 parameters calculated.
Calculating new bond and angle parameters with the modified Seminario method.
Traceback (most recent call last):
File "/work/scratch/b_mayer/miniconda3/envs/qubekit/lib/python3.10/site-packages/qubekit/workflow/workflow.py", line 348, in _run_stage
result_mol = stage.run(
File "/work/scratch/b_mayer/miniconda3/envs/qubekit/lib/python3.10/site-packages/qubekit/utils/datastructures.py", line 217, in run
molecule = self._run(molecule, *args, **kwargs)
File "/work/scratch/b_mayer/miniconda3/envs/qubekit/lib/python3.10/site-packages/qubekit/bonded/mod_seminario.py", line 229, in _run
hessian *= conversion
TypeError: unsupported operand type(s) for *=: 'NoneType' and 'float'
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/work/scratch/b_mayer/miniconda3/envs/qubekit/bin/qubekit", line 8, in <module>
sys.exit(cli())
File "/work/scratch/b_mayer/miniconda3/envs/qubekit/lib/python3.10/site-packages/click/core.py", line 1130, in __call__
return self.main(*args, **kwargs)
File "/work/scratch/b_mayer/miniconda3/envs/qubekit/lib/python3.10/site-packages/click/core.py", line 1055, in main
rv = self.invoke(ctx)
File "/work/scratch/b_mayer/miniconda3/envs/qubekit/lib/python3.10/site-packages/click/core.py", line 1657, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/work/scratch/b_mayer/miniconda3/envs/qubekit/lib/python3.10/site-packages/click/core.py", line 1404, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/work/scratch/b_mayer/miniconda3/envs/qubekit/lib/python3.10/site-packages/click/core.py", line 760, in invoke
return __callback(*args, **kwargs)
File "/work/scratch/b_mayer/miniconda3/envs/qubekit/lib/python3.10/site-packages/qubekit/cli/run.py", line 172, in restart
workflow.restart_workflow(
File "/work/scratch/b_mayer/miniconda3/envs/qubekit/lib/python3.10/site-packages/qubekit/workflow/workflow.py", line 247, in restart_workflow
return self._run_workflow(molecule=molecule, results=result, workflow=run_order)
File "/work/scratch/b_mayer/miniconda3/envs/qubekit/lib/python3.10/site-packages/qubekit/workflow/workflow.py", line 310, in _run_workflow
molecule = self._run_stage(
File "/work/scratch/b_mayer/miniconda3/envs/qubekit/lib/python3.10/site-packages/qubekit/workflow/workflow.py", line 372, in _run_stage
raise WorkFlowExecutionError(
qubekit.utils.exceptions.WorkFlowExecutionError: The workflow stopped unexpectedly due to the following error at stage: bonded_parameters
I did restart like this after doing a cd to the QUBEKIT_... folder:
qubekit restart --cores 96 --memory 240 -r workflow_result.json charges
@jthorton, is there something wrong with this?
@entropybit, it's been a long time since you asked your question, but I believe that the problem with Gaussian was indeed memory. You requested 2500 MB per CPU, which is 240 GB across your 96 tasks, but in Gaussian you set %Mem 1200GB, which is much higher.
As for the QUBEKit part, you were trying to restart at the charges step, while previously your calculation failed at the hessian stage, so there is no hessian data in workflow_result.json (you can check the hessian attribute: it has a value of null). So when QUBEKit tried to multiply the hessian by a conversion factor, it got an exception.
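So, assuming the stage name matches the one in the error message above, the restart would need to pick up from the hessian stage instead, e.g. something like:
qubekit restart --cores 96 --memory 240 -r workflow_result.json hessian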
Consider trying to work out how much memory Gaussian actually needs to calculate your molecule, and then specify that value for QUBEKit.
@jthorton, I believe it might be useful in some scenarios to be able to run QUBEKit starting from an already-computed log/fchk file, so it won't need to perform the calculations itself.