cclib
Test and add support for ORCA6
ORCA 6 came out today, which means we need to add test files and add support for it. This issue is just a reminder of that.
It looks like there is now a machine-readable "property" file written out for (almost) all calculations by default, which will probably help a lot. This should be a lot less fragile than the regular log file. Note that there is no property file for MD and L-Opt calculations (by default? unclear how to enable it if this is the case) according to the manual. There is also now a mechanism to convert the property file into a JSON file via orca_2json.
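As a rough illustration of why a JSON export could be easier to consume than the log file, here is a minimal sketch that only lists the top-level keys of such a file. The function name `top_level_keys` and any file name passed to it are hypothetical, and nothing is assumed about the schema that orca_2json actually writes beyond the root being a JSON object.

```python
import json
from pathlib import Path


def top_level_keys(json_path):
    """Return the sorted top-level keys of a JSON file whose root is an object.

    Hypothetical helper: it makes no assumption about the actual schema
    produced by orca_2json, it only inspects what is there.
    """
    with Path(json_path).open(encoding="utf-8") as handle:
        data = json.load(handle)
    if not isinstance(data, dict):
        raise TypeError("expected a JSON object at the top level")
    return sorted(data)
```

Inspecting the keys first is a cautious way to explore the export before writing any parser against it.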
Here is a minimal ORCA 6 example I ran: minimal.zip
Unfortunately, as expected, cclib can't parse things out of the box:
from cclib.io import ccread
ccread("orca.out") # from minimal.zip
[ORCA orca.out ERROR] Encountered error when parsing.
[ORCA orca.out ERROR] Last line read: Last RMS-Density change ... 1.4962e-06 Tolerance : 1.0000e-06
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/scratch/network/asrosen/software/miniconda/envs/quacc/lib/python3.11/site-packages/cclib/io/ccio.py", line 185, in ccread
    return log.parse()
           ^^^^^^^^^^^
  File "/scratch/network/asrosen/software/miniconda/envs/quacc/lib/python3.11/site-packages/cclib/parser/logfileparser.py", line 165, in parse
    self.extract(self.inputfile, line)
  File "/scratch/network/asrosen/software/miniconda/envs/quacc/lib/python3.11/site-packages/cclib/parser/orcaparser.py", line 545, in extract
    self._append_scfvalues_scftargets(inputfile, line)
  File "/scratch/network/asrosen/software/miniconda/envs/quacc/lib/python3.11/site-packages/cclib/parser/orcaparser.py", line 2799, in _append_scfvalues_scftargets
    self.scfvalues[-1].append([deltaE_value, maxDP_value, rmsDP_value])
    ~~~~~~~~~~~~~~^^^^
IndexError: list index out of range
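The traceback shows `self.scfvalues[-1]` being indexed while the list is still empty, presumably because the ORCA 6 output no longer triggers whatever code creates the first per-cycle list. Below is a minimal sketch of that failure mode and one defensive guard. The helper name `append_scf_values` is hypothetical, and this is not necessarily how the parser was or should be fixed.

```python
def append_scf_values(scfvalues, triple):
    """Append one [deltaE, maxDP, rmsDP] triple to the most recent SCF run.

    Hypothetical helper: if no SCF run has been started yet (empty list),
    start one instead of failing on scfvalues[-1].
    """
    if not scfvalues:
        scfvalues.append([])
    scfvalues[-1].append(triple)
    return scfvalues


# Reproducing the original failure on an empty list:
try:
    [][-1]
except IndexError as exc:
    message = str(exc)  # "list index out of range"
```

A guard like this hides the symptom; the underlying question is still why the section header that normally starts a new SCF run is not matched in ORCA 6 output.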
Is anyone actively working on this? Is the plan to support only 2.0 or also support 1.x?
None of the maintainers are AFAIK. We haven't even discussed it. My personal opinion is that, if it does not require a full rewrite of the current 1.x parser (I haven't looked at the output), then if someone wants to add it to 1.x, they can. It will take a while to finish 2.0 as we incrementally port things over and it will ~soon be possible to make further releases on 1.x.
I might take a stab at it then using the 1.x parser. Have there been breaking changes on master since 1.8.1?
Yes, but I'm going to revert them ASAP (#1482).
Hi folks, I might also work on this if that's alright. I've already added a few bits based on @ATenderholt's branch (hope you don't mind). Mine's at https://github.com/oliver-s-lee/cclib/tree/orca6
Perhaps we can set up a new temp branch on the main cclib GitHub to collaborate on?
I'd be happy if you took it over. I haven't had as much time to work on it as I thought I would.
Happy to help too. I cloned @oliver-s-lee's branch, fixed the SCF parsing issue and the dipole moment parsing, and added quadrupole moment parsing for now. Any output log files would be extremely helpful; for my own use case (MBIS, quadrupole, gradient) the parsing is now correct.
Thanks @FNTwin, the more the merrier!
Just for everyone's benefit, I believe this is the minimum list of tests we'd be looking to have passing (based on https://cclib.github.io/development.html and a few other important features we added for Orca 5.x):
- [x] Restricted single point HF/STO-3G (dvb)
- [x] Restricted single point (large basis) HF/aug-cc-pCVQZ (C)
- [x] Restricted single point B3LYP/STO-3G (dvb)
- [x] Unrestricted single point HF/STO-3G (dvb)
- [x] Unrestricted single point B3LYP/STO-3G (dvb)
- [x] Opt B3LYP/STO-3G (dvb)
- [x] Scan B3LYP/STO-3G (dvb)
- [x] Frequency B3LYP/STO-3G (dvb)
- [x] Raman B3LYP/STO-3G (dvb)
- [x] TD-DFT excited states B3LYP/STO-3G (dvb)
- [x] ROCIS HF/STO-3G (dvb)
- [x] MP2 single point STO-3G (water)
- [x] MP3 single point STO-3G (water)
- [x] CCSD single point STO-3G (water)
- [x] CCSD(T) single point STO-3G (water)
- [x] CC excited states EOM-CCSD/STO-3G (dvb)
- [x] NMR shift and coupling B3LYP/STO-3G (dvb)
- [x] Solvent (will need to check manual for what Orca 6 supports, but probably CPCM)
- [x] Static polarizabilities HF/STO-3G (tryptophan)
I'll work on generating the necessary log files over the next few days. If anyone fancies tackling anything in particular just shout!
Solvent (will need to check manual for what Orca 6 supports, but probably CPCM)
I can take this. Done
Unrestricted single point HF/STO-3G (dvb)
Also this. Done
Restricted single point (large basis) HF/aug-cc-pCVQZ (C)
Also this without issue. Done
Restricted single point HF/STO-3G (dvb)
Done
Static polarizabilities HF/STO-3G (tryptophan)
Done
Thanks @FNTwin, much appreciated!
I think that's everything we should need for now. Would you be able to open a pull request against https://github.com/cclib/cclib/tree/orca6 with your changes? Then we can begin merging everything together.
I tried to use cclib to parse ORCA 6.0.1 output files from frequency calculations, installing the version from the orca6 branch. Interestingly, while the test file in the repository is parsed without issues, my own output file could not be read and produced the following error:
[ORCA freq_test/freq.out ERROR] Encountered error when parsing.
[ORCA freq_test/freq.out ERROR] Last line read: 3 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In[4], line 1
----> 1 data = cclib.io.ccread("freq_test/freq.out")
File ~/opt/cclib/cclib/io/ccio.py:185, in ccread(source, *args, **kwargs)
182 if log:
183 logger.info("Identified logfile to be in %s format", type(log).__name__)
--> 185 return log.parse()
186 else:
187 logger.info("Attempting to use fallback mechanism to read file")
File ~/opt/cclib/cclib/parser/logfileparser.py:162, in Logfile.parse(self, progress, fupdate, cupdate)
157 # This call should check if the line begins a section of extracted data.
158 # If it does, it parses some lines and sets the relevant attributes (to self).
159 # Any attributes can be freely set and used across calls, however only those
160 # in data._attrlist will be moved to final data object that is returned.
161 try:
--> 162 self.extract(self.inputfile, line)
163 except StopParsing:
164 # This is fine
165 break
File ~/opt/cclib/cclib/parser/orcaparser.py:2243, in ORCA.extract(self, inputfile, line)
2240 _irreps = next(inputfile)
2242 for atom in range(self.natom):
-> 2243 all_vibdisps[mode : mode + matrix_columns, atom, 0] = next(
2244 inputfile
2245 ).split()[1:]
2246 all_vibdisps[mode : mode + matrix_columns, atom, 1] = next(
2247 inputfile
2248 ).split()[1:]
2249 all_vibdisps[mode : mode + matrix_columns, atom, 2] = next(
2250 inputfile
2251 ).split()[1:]
ValueError: could not broadcast input array from shape (6,) into shape (10,)
To me, this seems to be because the test file prints 10 normal modes side by side, while my output file prints only 6 normal modes side by side. I have attached the output file for my frequency calculation for reference.
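To make the width problem concrete, here is a minimal pure-Python sketch that infers the number of columns from each block's header line instead of assuming a fixed width. The function name `parse_mode_blocks` is hypothetical, and the layout it assumes (a header line of mode indices, followed by three rows per atom, each starting with a row index) is inferred from the traceback above rather than taken from the cclib source.

```python
def parse_mode_blocks(lines, natom):
    """Parse normal-mode displacement blocks of arbitrary column width.

    Assumed layout (hypothetical, inferred from the traceback): each block
    starts with a header line of integer mode indices, followed by 3 * natom
    rows of the form "<row index> <value> <value> ...".
    Returns a list indexed by mode, each entry a list of [x, y, z] per atom.
    """
    it = iter(lines)
    modes = {}
    for header in it:
        columns = [int(token) for token in header.split()]
        if not columns:
            continue  # skip blank lines between blocks
        for atom in range(natom):
            for axis in range(3):
                values = next(it).split()[1:]  # drop the leading row index
                if len(values) != len(columns):
                    raise ValueError("row width does not match block header")
                for mode, value in zip(columns, values):
                    if mode not in modes:
                        modes[mode] = [[0.0, 0.0, 0.0] for _ in range(natom)]
                    modes[mode][atom][axis] = float(value)
    return [modes[mode] for mode in sorted(modes)]
```

Because the width is read per block, the same code handles 6-wide and 10-wide tables, as well as a narrower final block.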
Hi @O2-AC, thanks for the bug report! You're right that the width of the matrix is the issue. I was under the impression that Orca 6 always used the wide table format, but clearly I got that wrong. Perhaps the wide table is only used when symmetry is turned on. Anyway, I should have a fix soon.
Addressed in aabf6cd
Yep, confirmed.
This should be all set with https://github.com/cclib/cclib/pull/1517.