
Test and add support for ORCA6

Open Andrew-S-Rosen opened this issue 1 year ago • 4 comments

ORCA 6 came out today, which means we need to add test files and add support for it. This issue is just a reminder of that.

It looks like there is now a machine-readable "property" file written out for (almost) all calculations by default, which will probably help a lot. This should be a lot less fragile than the regular log file. Note that there is no property file for MD and L-Opt calculations (by default? unclear how to enable it if this is the case) according to the manual. There is also now a mechanism to convert the property file into a JSON file via orca_2json.
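For anyone who wants to look at what the converted JSON contains before writing a parser for it, here is a minimal sketch using only the Python standard library. It assumes the property file has already been converted to JSON; the file name in the usage note is hypothetical, and no particular key layout is assumed, since I haven't checked the schema.

```python
import json
from pathlib import Path


def summarize_json(path):
    """Return a {top-level key: type name} mapping for a JSON file.

    Nothing about the layout of the ORCA property JSON is assumed here;
    this only reports what is at the top level of whatever file is given.
    """
    with Path(path).open(encoding="utf-8") as handle:
        data = json.load(handle)
    if not isinstance(data, dict):
        # The root is a list or a scalar, so report its type instead.
        return {"<root>": type(data).__name__}
    return {key: type(value).__name__ for key, value in data.items()}
```

Usage would be something like `summarize_json("orca.property.json")`, where the file name is a placeholder for whatever the conversion step actually writes.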

Here is a minimal ORCA 6 example I ran: minimal.zip

Unfortunately, as expected, cclib can't parse things out of the box:

from cclib.io import ccread

ccread("orca.out") # from minimal.zip
[ORCA orca.out ERROR] Encountered error when parsing.
[ORCA orca.out ERROR] Last line read:   Last RMS-Density change    ...    1.4962e-06  Tolerance :   1.0000e-06
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/scratch/network/asrosen/software/miniconda/envs/quacc/lib/python3.11/site-packages/cclib/io/ccio.py", line 185, in ccread
    return log.parse()
           ^^^^^^^^^^^
  File "/scratch/network/asrosen/software/miniconda/envs/quacc/lib/python3.11/site-packages/cclib/parser/logfileparser.py", line 165, in parse
    self.extract(self.inputfile, line)
  File "/scratch/network/asrosen/software/miniconda/envs/quacc/lib/python3.11/site-packages/cclib/parser/orcaparser.py", line 2799, in _append_scfvalues_scftargets
    self.scfvalues[-1].append([deltaE_value, maxDP_value, rmsDP_value])
    ~~~~~~~~~~~~~~^^^^
IndexError: list index out of range
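
The `IndexError` means `self.scfvalues` was still an empty list when `[-1]` was indexed, which suggests that no SCF cycle had been started by the time this line of the ORCA 6 output was reached. As a rough sketch of the failure mode only (this is not cclib's code or its eventual fix, and `append_scf_values` is a hypothetical helper), a guard could look like this:

```python
def append_scf_values(scfvalues, delta_e, max_dp, rms_dp):
    """Append one set of SCF convergence values to the newest SCF cycle.

    If no cycle has been started yet (the outer list is empty), start one
    first instead of indexing scfvalues[-1] on an empty list.
    """
    if not scfvalues:
        scfvalues.append([])
    scfvalues[-1].append([delta_e, max_dp, rms_dp])
    return scfvalues
```

A guard like this only hides the symptom. If the real cause is that the line which normally starts a new SCF cycle has changed in ORCA 6, updating that detection would be the better fix.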

Andrew-S-Rosen avatar Jul 25 '24 13:07 Andrew-S-Rosen

Is anyone actively working on this? Is the plan to support only 2.0 or also support 1.x?

ATenderholt avatar Oct 17 '24 15:10 ATenderholt

> Is anyone actively working on this? Is the plan to support only 2.0 or also support 1.x?

None of the maintainers are AFAIK. We haven't even discussed it. My personal opinion is that, if it does not require a full rewrite of the current 1.x parser (I haven't looked at the output), then if someone wants to add it to 1.x, they can. It will take a while to finish 2.0 as we incrementally port things over and it will ~soon be possible to make further releases on 1.x.

berquist avatar Oct 17 '24 15:10 berquist

I might take a stab at it then using the 1.x parser. Have there been breaking changes on master since 1.8.1?

ATenderholt avatar Oct 18 '24 02:10 ATenderholt

> I might take a stab at it then using the 1.x parser. Have there been breaking changes on master since 1.8.1?

Yes, but I'm going to revert them ASAP (#1482).

berquist avatar Oct 19 '24 20:10 berquist

Hi folks, I might also work on this if that's alright. I've already added a few bits based on @ATenderholt's branch (hope you don't mind). Mine's at https://github.com/oliver-s-lee/cclib/tree/orca6

Perhaps we can set up a new temp branch on the main cclib GitHub to collab on?

oliver-s-lee avatar Dec 07 '24 09:12 oliver-s-lee

I'd be happy if you took it over. I haven't had as much time to work on it as I thought I would.

ATenderholt avatar Dec 07 '24 14:12 ATenderholt

Happy to help too. I cloned @oliver-s-lee's branch, fixed the SCF parsing issue and the dipole moment parsing, and added quadrupole moment parsing for now. Any output log files would be extremely helpful; for my use case (MBIS, quadrupole, gradient) the parsing is now correct.

FNTwin avatar Dec 09 '24 17:12 FNTwin

Thanks @FNTwin, the more the merrier!

Just for everyone's benefit, I believe this is the minimum list of tests we'd be looking to have passing (based on https://cclib.github.io/development.html and a few other important features we added for Orca 5.x):

  • [x] Restricted single point HF/STO-3G (dvb)
  • [x] Restricted single point (large basis) HF/aug-cc-pCVQZ (C)
  • [x] Restricted single point B3LYP/STO-3G (dvb)
  • [x] Unrestricted single point HF/STO-3G (dvb)
  • [x] Unrestricted single point B3LYP/STO-3G (dvb)
  • [x] Opt B3LYP/STO-3G (dvb)
  • [x] Scan B3LYP/STO-3G (dvb)
  • [x] Frequency B3LYP/STO-3G (dvb)
  • [x] Raman B3LYP/STO-3G (dvb)
  • [x] TD-DFT excited states B3LYP/STO-3G (dvb)
  • [x] ROCIS HF/STO-3G (dvb)
  • [x] MP2 single point STO-3G (water)
  • [x] MP3 single point STO-3G (water)
  • [x] CCSD single point STO-3G (water)
  • [x] CCSD(T) single point STO-3G (water)
  • [x] CC excited states EOM-CCSD/STO-3G (dvb)
  • [x] NMR shift and coupling B3LYP/STO-3G (dvb)
  • [x] Solvent (will need to check manual for what Orca 6 supports, but probably CPCM)
  • [x] Static polarizabilities HF/STO-3G (tryptophan)

I'll work on generating the necessary log files over the next few days. If anyone fancies tackling anything in particular just shout!

oliver-s-lee avatar Dec 10 '24 10:12 oliver-s-lee

> Solvent (will need to check manual for what Orca 6 supports, but probably CPCM)

I can take this. Done

> Unrestricted single point HF/STO-3G (dvb)

Also this. Done

> Restricted single point (large basis) HF/aug-cc-pCVQZ (C)

Also this without issue. Done

> Restricted single point HF/STO-3G (dvb)

Done

> Static polarizabilities HF/STO-3G (tryptophan)

Done

FNTwin avatar Dec 12 '24 18:12 FNTwin

Thanks @FNTwin, much appreciated!

I think that's everything we should need for now. Would you be able to open a pull request against https://github.com/cclib/cclib/tree/orca6 with your changes? Then we can begin merging everything together.

oliver-s-lee avatar Dec 13 '24 09:12 oliver-s-lee

I tried to use cclib to parse ORCA 6.0.1 output files from frequency calculations by installing the version from the orca6 branch. Interestingly, while the test file in the repository parses without issues, my own output file could not be read and produced the following error:

[ORCA freq_test/freq.out ERROR] Encountered error when parsing.
[ORCA freq_test/freq.out ERROR] Last line read:       3       0.000000   0.000000   0.000000   0.000000   0.000000   0.000000

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[4], line 1
----> 1 data = cclib.io.ccread("freq_test/freq.out")

File ~/opt/cclib/cclib/io/ccio.py:185, in ccread(source, *args, **kwargs)
    182 if log:
    183     logger.info("Identified logfile to be in %s format", type(log).__name__)
--> 185     return log.parse()
    186 else:
    187     logger.info("Attempting to use fallback mechanism to read file")

File ~/opt/cclib/cclib/parser/logfileparser.py:162, in Logfile.parse(self, progress, fupdate, cupdate)
    157 # This call should check if the line begins a section of extracted data.
    158 # If it does, it parses some lines and sets the relevant attributes (to self).
    159 # Any attributes can be freely set and used across calls, however only those
    160 #   in data._attrlist will be moved to final data object that is returned.
    161 try:
--> 162     self.extract(self.inputfile, line)
    163 except StopParsing:
    164     # This is fine
    165     break

File ~/opt/cclib/cclib/parser/orcaparser.py:2243, in ORCA.extract(self, inputfile, line)
   2240     _irreps = next(inputfile)
   2242 for atom in range(self.natom):
-> 2243     all_vibdisps[mode : mode + matrix_columns, atom, 0] = next(
   2244         inputfile
   2245     ).split()[1:]
   2246     all_vibdisps[mode : mode + matrix_columns, atom, 1] = next(
   2247         inputfile
   2248     ).split()[1:]
   2249     all_vibdisps[mode : mode + matrix_columns, atom, 2] = next(
   2250         inputfile
   2251     ).split()[1:]

ValueError: could not broadcast input array from shape (6,) into shape (10,)

This seems to be because the test file prints 10 normal modes side by side, while my output file prints only 6. I have attached the output file for my frequency calculation for reference.

freq.txt

O2-AC avatar Apr 03 '25 20:04 O2-AC

Hi @O2-AC, thanks for the bug report! You're right that the width of the matrix is the issue. I was under the impression that Orca 6 always used the wide table format, but clearly I got that wrong. Perhaps the wide table is only used when symmetry is turned on. Anyway, I should have a fix soon.
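
To illustrate the general idea of not hard-coding the table width, here is a minimal standalone sketch. It is not the parser code and not necessarily what the fix will look like; `read_mode_block` is a hypothetical helper, and it assumes each block starts with a header row of integer mode indices followed by 3 * natom rows of the form "row index, then one value per mode".

```python
def read_mode_block(lines, natom):
    """Parse one block of a normal-mode table.

    lines: iterator of text lines positioned at the block's header row,
           which lists the mode indices printed in this block.
    natom: number of atoms (the block has 3 * natom data rows).

    Returns (mode_indices, rows), where rows[i] holds the displacement of
    Cartesian coordinate i for each mode in the block.
    """
    header = next(lines).split()
    mode_indices = [int(token) for token in header]
    ncols = len(mode_indices)  # read from the file, not assumed to be 6 or 10

    rows = []
    for _ in range(3 * natom):
        tokens = next(lines).split()
        values = [float(token) for token in tokens[1:]]  # tokens[0] is the row index
        if len(values) != ncols:
            raise ValueError(f"expected {ncols} columns, found {len(values)}")
        rows.append(values)
    return mode_indices, rows
```

Because the column count comes from the header row, a 6-wide and a 10-wide block go through the same code path, and a mismatch raises a clear error instead of a broadcast failure.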

oliver-s-lee avatar Apr 09 '25 14:04 oliver-s-lee

Addressed in aabf6cd

oliver-s-lee avatar Apr 10 '25 07:04 oliver-s-lee

> Addressed in aabf6cd

Yep, confirmed.

This should be all set with https://github.com/cclib/cclib/pull/1517.

berquist avatar Apr 13 '25 18:04 berquist