nexus rmg updates

Open kgasperich opened this issue 2 years ago • 10 comments

Please review the developer documentation on the wiki of this project that contains help and requirements.

Proposed changes

  • [x] add RMG to ppset known codes
  • [x] drive RMG NSCF workflow with Nexus
  • [x] drive SD-DMC calculation workflow with Nexus starting with RMG orbitals

What type(s) of changes does this code introduce?

Delete the items that do not apply

  • New feature

Does this introduce a breaking change?

  • No

What systems has this change been tested on?

  • Laptop

Checklist

Update the following with a yes where the items apply. If you're unsure about any of them, don't hesitate to ask. This is simply a reminder of what we are going to look for before merging your code.

  • Yes. This PR is up to date with the current state of 'develop'
  • No. Code added or changed in the PR has been clang-formatted
  • Yes. This PR adds tests to cover any new code
  • Yes. Documentation has been added (if appropriate): full Nexus example.

kgasperich avatar Sep 26 '23 22:09 kgasperich

@anbenali what is still needed to move this out of WIP for review?

jtkrogel avatar Oct 10 '23 13:10 jtkrogel

I want to have it working with the MSD route as well. However, we can still just create another PR.

anbenali avatar Oct 10 '23 13:10 anbenali

Better to keep things moving imo.

prckent avatar Oct 10 '23 15:10 prckent

Checking on the status of this. It still has [WIP] but does anything still need to be done besides solving the conflict on qmcpack.py?

prckent avatar May 15 '24 23:05 prckent

@anbenali @kgasperich ping

jtkrogel avatar May 17 '24 13:05 jtkrogel

Ping. Can you remove the WIP or is another update needed?

prckent avatar Jun 26 '24 17:06 prckent

This works on my desktop. Ready to merge when it passes the tests here

anbenali avatar Oct 18 '24 15:10 anbenali

After the meeting this week I'll give this a by-hand test drive -- we don't run these examples in CI (yet).

It looks like any recent version of RMG should be OK with this.

prckent avatar Oct 21 '24 14:10 prckent

I built a fresh QMCPACK + NEXUS + RMG environment, but could not get the example workflow to complete. Please can you recheck? It looks like there might be a labeling or file copying issue. If you run the workflow with no preexisting files, does it work? While the existing 01 example runs fine, the 02 full example here can't find the wave.out.h5 file. I am using the last full release of RMG, v6.1.0:

  starting runs:
  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 
  elapsed time 0.0 s  memory 122.93 MB 
    Entering ./runs/RMG/scf 0 
      writing input files  0 scf 
    Entering ./runs/RMG/scf 0 
      sending required files  0 scf 
      submitting job  0 scf 
    Entering ./runs/RMG/scf 0 
      Executing:  
        export OMP_NUM_THREADS=1
        mpirun -np 1 rmg-cpu scf.in 

  elapsed time 5.0 s  memory 385.74 MB 
  elapsed time 10.0 s  memory 743.96 MB 
  elapsed time 15.1 s  memory 619.27 MB 
  elapsed time 20.1 s  memory 643.58 MB 
  elapsed time 25.1 s  memory 122.93 MB 
    Entering ./runs/RMG/scf 0 
      copying results  0 scf 
    Entering ./runs/RMG/scf 0 
      analyzing  0 scf 

  elapsed time 30.1 s  memory 122.93 MB 
    Entering ./runs/RMG/nscf-3x3x3 1 
      writing input files  1 nscf 
    Entering ./runs/RMG/nscf-3x3x3 1 
      sending required files  1 nscf 
      submitting job  1 nscf 
    Entering ./runs/RMG/nscf-3x3x3 1 
      Executing:  
        export OMP_NUM_THREADS=1
        mpirun -np 1 rmg-cpu nscf.in 

  elapsed time 35.2 s  memory 333.10 MB 
  elapsed time 40.2 s  memory 621.51 MB 
  elapsed time 45.2 s  memory 122.93 MB 
    Entering ./runs/RMG/nscf-3x3x3 1 
      copying results  1 nscf 
    Entering ./runs/RMG/nscf-3x3x3 1 
      analyzing  1 nscf 

  elapsed time 50.3 s  memory 122.93 MB 

  Qmcpack error:
    wavefunction file not found:
    ./runs/RMG/nscf-3x3x3/Waves/wave.out.h5
  exiting.

  File "/home/pk7/projects/qmc/git_QMCPACK_prckent/qmcpack/nexus/examples/rmg/02_diamond_scf_nscf_optJ123_dmc/./Diamond_full.py", line 219, in <module>
    run_project()
  File "/home/pk7/projects/qmc/git_QMCPACK_prckent/qmcpack/nexus/lib/nexus.py", line 563, in run_project
    pm.run_project()
  File "/home/pk7/projects/qmc/git_QMCPACK_prckent/qmcpack/nexus/lib/project_manager.py", line 101, in run_project
    self.progress_cascades()
  File "/home/pk7/projects/qmc/git_QMCPACK_prckent/qmcpack/nexus/lib/project_manager.py", line 334, in progress_cascades
    cascade.progress()
  File "/home/pk7/projects/qmc/git_QMCPACK_prckent/qmcpack/nexus/lib/simulation.py", line 1291, in progress
    sim.progress(self.simid)
  File "/home/pk7/projects/qmc/git_QMCPACK_prckent/qmcpack/nexus/lib/simulation.py", line 1291, in progress
    sim.progress(self.simid)
  File "/home/pk7/projects/qmc/git_QMCPACK_prckent/qmcpack/nexus/lib/simulation.py", line 1245, in progress
    self.get_dependencies()
  File "/home/pk7/projects/qmc/git_QMCPACK_prckent/qmcpack/nexus/lib/simulation.py", line 854, in get_dependencies
    self.incorporate_result(result_name,result,sim)
  File "/home/pk7/projects/qmc/git_QMCPACK_prckent/qmcpack/nexus/lib/qmcpack.py", line 759, in incorporate_result
    self.error('wavefunction file not found:\n'+h5file)
  File "/home/pk7/projects/qmc/git_QMCPACK_prckent/qmcpack/nexus/lib/generic.py", line 503, in error
    error(message,header,exit,trace,logfile=self._logfile)
  File "/home/pk7/projects/qmc/git_QMCPACK_prckent/qmcpack/nexus/lib/generic.py", line 138, in error
    traceback.print_stack()

Meanwhile:

$ find . -name "wave.out.h5" -exec ls -l {} \;
-rw-r--r-- 1 *** ***** 24585952 Oct 29 09:56 ./runs/RMG/scf/Waves/wave.out.h5

The scf looks OK:

$ find . -name "*.out" -exec tail -n 5 {} \;
 quench: [md:   0/100  scf:   4/100  step time:   0.76  scf time:     4.01 secs  RMS[dV]: 1.36e-04 ]
 quench: [md:   0/100  scf:   5/100  step time:   0.55  scf time:     4.57 secs  RMS[dV]: 1.24e-04 ]
 quench: [md:   0/100  scf:   6/100  step time:   0.49  scf time:     5.05 secs  RMS[dV]: 9.64e-05 ]
 quench: [md:   0/100  scf:   7/100  step time:   0.73  scf time:     5.78 secs  RMS[dV]: 6.80e-06 ]
 Convergence criterion reached: Energy variation (9.00e-10) is lower than threshold (1.00e-09)

but the nscf step simply can't read the file, since it isn't there:

Terminating. at LINE 91 in /scratch/pk7/spack_build_stage/spack-stage-rmgdft-6.1.0-vlcvr2bxjt2nav642q75cvarslrsrkrm/spack-src/RMG/Common/ReadData.cpp.
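
The mismatch above (the file is expected under nscf-3x3x3/Waves but only exists under scf/Waves) can be checked without rerunning the workflow. The following is a minimal sketch that assumes only the directory layout shown in the log; the helper name `locate_wavefunction` is hypothetical and is not part of Nexus or RMG.

```python
from pathlib import Path


def locate_wavefunction(runs_root, expected_dir, filename="wave.out.h5"):
    """Report whether `filename` exists in `expected_dir` and list every
    copy found anywhere under `runs_root`.

    Hypothetical helper for illustration only; it just walks the
    filesystem and knows nothing about Nexus internals.
    """
    expected = Path(expected_dir) / filename
    # rglob searches recursively, like `find . -name wave.out.h5`
    found = sorted(p for p in Path(runs_root).rglob(filename) if p.is_file())
    return expected.is_file(), found


def report(runs_root, expected_dir, filename="wave.out.h5"):
    """Print a short summary of where the wavefunction file actually is."""
    ok, found = locate_wavefunction(runs_root, expected_dir, filename)
    print(f"expected location present: {ok}")
    for path in found:
        print(f"found: {path}")
```

Calling `report("./runs", "./runs/RMG/nscf-3x3x3/Waves")` from the example directory would show whether the expected location is populated and where any copies of the file actually live.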

prckent avatar Oct 29 '24 14:10 prckent

This might just need a little tuning up. It looks mostly fine and I would like it to end up going in.

jtkrogel avatar Nov 13 '25 20:11 jtkrogel