qmcpack icon indicating copy to clipboard operation
qmcpack copied to clipboard

Add Ghost-Atom Support in QMCPACK/NEXUS.

Open rezapputra opened this issue 5 months ago • 3 comments

Feature request: Add ghost‐atom support to PyscfToQmcpack.py, convert4qmc, and NEXUS so that QMCPACK can perform counterpoise BSSE corrections.

Rationale: Basis‐set superposition error (BSSE) can dominate binding‐energy errors. Counterpoise corrections with modest basis sets (e.g. cc-pVTZ, aug-cc-pVDZ) approach the complete‐basis‐set limit (Nakano et al., JCTC 2025) [DOI:10.1021/acs.jctc.4c01631]. Primer about BSSE here.

Current workaround: I apply an ad-hoc patch (ccECP only on real atoms) to reproduce Nakano’s results. Details below; would love native ghost-atom handling instead.

1. To make PySCF’s ghost atom system work with QMCPACK.

The only script that needs modification is PyscfToQmcpack.py, so that it can correctly interpret ghost atoms from the PySCF output.

Since ghost atoms are defined simply by a prefix (e.g. "X-"), they can be handled by adding entries to the IonName dictionary with an atomic number of zero.

In PyscfToQmcpack.py, add:

IonName.update({f"X-{el}": 0 for el in ["H","C"]})

2. Generate & fix QMCPACK XML

Assuming no errors appear from the savetoqmcpack function in PyscfToQmcpack.py, the next step is to use convert4qmc to generate QMCPACK files.

2.1. Edit .structure.xml

The .structure.xml file needs some editing. convert4qmc generates an empty <group> element in the particleset group:

<group name="">
    <parameter name="charge">0</parameter>
    <parameter name="valence">0</parameter>
    <parameter name="atomicnumber">0</parameter>
</group>

When the conversion routine encounters a species with atomic number zero, it creates a blank <group name=""> entry. We’ll replace that placeholder with explicit ghost-atom groups.

For example:

<group name="X-C">
    <parameter name="charge">0</parameter>
    <parameter name="valence">0</parameter>
    <parameter name="atomicnumber">0</parameter>
</group>
<group name="X-H">
    <parameter name="charge">0</parameter>
    <parameter name="valence">0</parameter>
    <parameter name="atomicnumber">0</parameter>
</group>

The attrib name="position" attribute prints the correct atomic positions, including ghost atoms, but attrib name="ionid" leaves out all ghost atoms, so we need to add them again.

For example:

</attrib>
    <attrib name="ionid" datatype="stringArray">
      X-C X-C X-H X-H X-H X-H H C H H H
</attrib>

Verify that the number of <ionid> entries matches <position> rows to avoid misalignment.

2.2. Update <pseudo> in .qmc.in

Modify the ECP part to ensure the <pseudo> blocks in the Hamiltonian section match ccECP specs and that ghost species carry no ccECP entry.

3. Add ghost-atom Jastrow terms

I found that inserting these correlation blocks for each ghost species makes the Jastrow optimization work as expected.

For example:

<!-- 1-body (electron–ghost-nucleus) -->
<correlation elementType="X-C" size="8" rcut="4.0" cusp="0.0">
    <coefficients id="eX-C" type="Array">
    </coefficients>
</correlation>

<!-- 3-body (electron–ghost-nucleus) -->
<correlation ispecies="X-C" especies1="u" especies2="u" isize="3" esize="3" rcut="5.0">
    <coefficients id="uuX-C" type="Array" optimize="yes"/>
</correlation>
<correlation ispecies="X-C" especies1="u" especies2="d" isize="3" esize="3" rcut="5.0">
    <coefficients id="udX-C" type="Array" optimize="yes"/>
</correlation>

This setup yields a variance-to-energy ratio for the ghost-atom calculation that is essentially the same as for the system without ghost atoms.

4. Run DMC (no changes)

No further edits are required for the DMC step. You can use the exact same .qmc.in settings and workflow you’d apply to the standard system—just point to your ghost-augmented wavefunction and structure files. In our tests, the only difference is that the Hamiltonian and Jastrow blocks already include the ghost species, but all other DMC parameters (time step, number of walkers, equilibration steps, etc.) remain identical.

5. Extend NEXUS for ghost system

As you can see from sections 1 to 3, the calculation process is similar to what NEXUS produces for a standard system. Therefore, it was possible for NEXUS to automate everything by editing the NEXUS Python library. Here, I will explain which library I edited and why.

5.1 Recognize ghosts in periodic_table.py

The first error is NEXUS does not recognize ghost atom like X-H or X-C.

on class SimpleElement:

class SimpleElement(DevBase):
    def __init__(self):
    ...
+   self.ghost             = False
    ...
+   def copy(self):
+       new = deepcopy(self)
+       return new

on class PeriodicTable:

 class PeriodicTable(DevBase):
+    GHOST_PREFIX = 'X-'     # or '*' or any convention you like

     element_set=set([
        'Ac','Al','Am','Sb','Ar','As','At','Ba','Bk','Be','Bi','B' ,'Br',
        'Cd','Ca','Cf','C' ,'Ce','Cs','Cl','Cr','Co','Cu','Cm','Dy','Es',
    ...
             isotopes[symbol] = elem_isotopes
         #end for
         self.isotopes = isotopes
        
+        # … build self.simple_elements and self.elements exactly as before …
+        # but remove the hacky `self[elem.symbol] = element` and instead:
+        self._base = self.elements   # real symbols → Element objects

     #end def __init__
    ...
 #end class PeriodicTable

With this change, there’s no need to edit the huge dictionary every time you add a new ghost atom; the ghost atom prefix can also be changed.

5.2 Propagate ghosts in physical_system.py

This time PhysicalSystem shows this error:

  PhysicalSystem error:
    particle X-H is unknown
  exiting.

In physical_system.py, edit the PhysicalSystem class:

     def add_particles(self, **particle_counts):
         pc = self.particle_collection  # all known particles
         plist = []
         for name,count in particle_counts.items():
+            # detect ghost-prefix (e.g. "X-" or your chosen prefix)
+            ghost = False
+            if name.startswith("X-"):
+                ghost = True
+                lookup_name = name[2:]      # strip "X-"
+            else:
+                lookup_name = name
             particle = pc.get_particle(lookup_name)
             if particle is None:
                 self.error('particle {0} is unknown'.format(name))
             else:
                 particle = particle.copy()
+                setattr(particle, "ghost", ghost) # mark it ghost for downstream code
             #end if
             particle.set_count(count)
             plist.append(particle)
         #end for
         self.particles.add_particles(plist)
         self.update()
     #end def add_particles

This will:

  1. Checks if your incoming name starts with "X-".
  2. Strips off the prefix before calling get_particle.
  3. Copies the real particle, then sets particle.ghost=True.

5.3 Handle ghost species in qmcpack_input.py

Now the error is more specific:

You said:
Traceback (most recent call last):
  File "/home/s2420025/Volumes/storage/hdd4/reza/06ghost/12test/02dmc_monoA/script.py", line 129, in <module>
    optJ2 = generate_qmcpack(
  File "/home/s2420025/app/qmcpack-3.17.1/nexus/lib/qmcpack.py", line 1591, in generate_qmcpack
    sim_args.input = generate_qmcpack_input(**inp_args)
  File "/home/s2420025/app/qmcpack-3.17.1/nexus/lib/qmcpack_input.py", line 7375, in generate_qmcpack_input
    inp = generate_basic_input(**kwargs)
  File "/home/s2420025/app/qmcpack-3.17.1/nexus/lib/qmcpack_input.py", line 7586, in generate_basic_input
    particlesets = generate_particlesets(
  File "/home/s2420025/app/qmcpack-3.17.1/nexus/lib/qmcpack_input.py", line 4566, in generate_particlesets
    ion = ions[ion_spec]
  File "/home/s2420025/app/qmcpack-3.17.1/nexus/lib/generic.py", line 158, in __getitem__
    return self.__dict__[name]
KeyError: 'X-C'

I proceeded to edit generate_particlesets in qmcpack_input.py:

def generate_particlesets(electrons   = 'e',
                          ions        = 'ion0',
                          up          = 'u',
    ...
         groups = []
+        ghost  = False
         for ion_spec in ion_species:
-            ion = ions[ion_spec]
+            # strip off ghost-prefix if present
+            ghost = False
+            key = ion_spec
+            if key.startswith("X-"):
+                ghost = True
+                key = key[2:]
+            # now look up the real ion
+            base_ion = ions[key]
+            ion = base_ion.copy()    # clone so we don’t mutate the original
+            ion.ghost = ghost        # mark it ghost if needed
             gpos = pos[elem==ion.name]
             g = group(
                 name         = ion.name,
    ...

After this, when ion_spec is "X-C", it will strip "X-", look up "C", copy it, flag it as a ghost, and carry on.

6. Remaining manual step.

With these changes in place, NEXUS will recognize and propagate ghost-atom species through almost every stage of the QMC workflow, from species creation and input file generation to Jastrow setup and the full DMC run. However, Even after these changes, NEXUS stop right after convert4qmc and you must still manually edit .structure.xml and .qmc.in. Automating that final patch would close the loop.

7. Example calculation: BSSE evaluation on CH4⋯⋯C2H4 with increasing basis set

Figure 1 shows the convergence of the DMC binding energy for the CH4⋯⋯C2H4 van der Waals complex toward the CBS limit (computed with aug-cc-pV6Z) as we increase the augmented correlation-consistent basis set. For each basis set, aug-cc-pVDZ and aug-cc-pVTZ, we plot both the raw DMC bonding energy (no CP) and the counterpoise corrected result (CP). Error bars are one-sigma QMC uncertainties. The plain binding energy at aug-cc-pVDZ underbinds by approximately 0.03 eV; applying the CP correction recovers nearly the CBS value (within approximately 0.01 eV). By aug-cc-pVTZ the uncorrected error is already around 0.02 eV.

Raw calculation data for this example is here.

Image

rezapputra avatar Jul 10 '25 11:07 rezapputra

Some understanding of ghost atoms was added to Nexus a long time ago: https://github.com/QMCPACK/qmcpack/pull/1653

To what degree does this address/not address your current use case?

jtkrogel avatar Jul 16 '25 20:07 jtkrogel

@rezapputra Please reply when you get a chance.

jtkrogel avatar Jul 28 '25 13:07 jtkrogel

Sorry for the late reply.

I just tried it.

to use ghost_atoms, I need to add it in nexus.py. It was not added originally.

This only resolves the error:

  PhysicalSystem error:
    particle X-C is unknown
  exiting.

but raises another problem similar to Section 5.3 here:

Traceback (most recent call last):
  File "/home/s2420025/Volumes/storage/hdd4/reza/06ghost/12test/02dmc_monoA_test/script.py", line 131, in <module>
    optJ2 = generate_qmcpack(
  File "/home/s2420025/app/qmcpack-3.17.1/nexus/lib/qmcpack.py", line 1591, in generate_qmcpack
    sim_args.input = generate_qmcpack_input(**inp_args)
  File "/home/s2420025/app/qmcpack-3.17.1/nexus/lib/qmcpack_input.py", line 7375, in generate_qmcpack_input
    inp = generate_basic_input(**kwargs)
  File "/home/s2420025/app/qmcpack-3.17.1/nexus/lib/qmcpack_input.py", line 7586, in generate_basic_input
    particlesets = generate_particlesets(
  File "/home/s2420025/app/qmcpack-3.17.1/nexus/lib/qmcpack_input.py", line 4566, in generate_particlesets
    ion = ions[ion_spec]
  File "/home/s2420025/app/qmcpack-3.17.1/nexus/lib/generic.py", line 158, in __getitem__
    return self.__dict__[name]
KeyError: 'X-C'

To answer your question:

To what degree does this address/not address your current use case? It might not do much other than resolving my problem on section 5.1.

  • Before I did this, I restored the Nexus library to the original.

rezapputra avatar Jul 28 '25 14:07 rezapputra