
OpenMPI build of NWChem is not called correctly

Open jlheflin opened this issue 6 months ago • 2 comments

Describe the bug When QCEngine runs an NWChem build that was compiled with OpenMPI, the calculation fails, because QCEngine calls the nwchem executable directly while that build needs to be launched through mpirun.

To Reproduce

Have a version of NWChem that was compiled with OpenMPI:

  • Homebrew
  • Nix
  • Ubuntu via the nwchem-openmpi or nwchem-mpich packages
  • Fedora via the nwchem-openmpi or nwchem-mpich packages

Run the following Python code:

import qcengine as qcng
import qcelemental as qcel

mol = qcel.models.Molecule.from_data("""
    O  0.0  0.000  -0.129
    H  0.0 -1.494  1.027
    H  0.0  1.494  1.027
""")

model = qcel.models.AtomicInput(
    molecule=mol,
    driver="energy",
    model={"method": "SCF", "basis": "sto-3g"}
)

ret = qcng.compute(model, "nwchem")
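To see whether the call above succeeded or came back as a failure, the return value can be summarized with a small helper. This is only a sketch: `describe_result` is a hypothetical name, and it assumes the returned object exposes `success`, `return_result`, and `error.error_message`, which matches the `ret.error.error_message` access used later in this thread.

```python
def describe_result(ret):
    """Return a one-line summary of a QCEngine compute() return value.

    Assumes `ret` has a boolean `success` attribute and either
    `return_result` (on success) or `error.error_message` (on failure).
    """
    if getattr(ret, "success", False):
        return f"success: return_result={ret.return_result}"

    error = getattr(ret, "error", None)
    message = getattr(error, "error_message", None) or ""
    lines = message.strip().splitlines()
    # Only the first line is shown; the full message can be very long
    # because it embeds the generated input, stdout, and stderr.
    first_line = lines[0] if lines else "<no error message>"
    return f"failed: {first_line}"
```

With the reproduction script, `print(describe_result(ret))` gives a short status line, while `print(ret.error.error_message)` shows the full message.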

Expected behavior The calculation completes and returns a result without an error.

Additional context I understand that task_config can be used to provide an mpiexec_command, but I have not been able to get that to work. I can bypass the error by placing an executable bash script named "nwchem" in my PATH:

#!/bin/bash
REAL_NWCHEM="/home/jacob/.nix-profile/bin/nwchem"
if [[ "$0" == "$REAL_NWCHEM" ]]; then
    echo "Error: Wrapper is calling itself!" >&2
    exit 1
fi
exec mpirun "$REAL_NWCHEM" "$@"
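The same workaround can be set up from Python before calling `qcng.compute`. The sketch below writes a simplified wrapper (without the self-call check) into a directory and puts that directory first on `PATH`. The helper name `install_nwchem_wrapper` is hypothetical, and the approach assumes QCEngine locates `nwchem` through a `PATH` lookup in the current process, which the bash workaround above suggests but which is not confirmed here.

```python
import os
import stat

# Simplified version of the bash wrapper: always launch the real binary via mpirun.
WRAPPER_TEMPLATE = """#!/bin/bash
exec mpirun "{real_nwchem}" "$@"
"""


def install_nwchem_wrapper(real_nwchem, wrapper_dir):
    """Write an executable 'nwchem' wrapper into wrapper_dir and prepend it to PATH."""
    os.makedirs(wrapper_dir, exist_ok=True)
    wrapper = os.path.join(wrapper_dir, "nwchem")
    with open(wrapper, "w") as fh:
        fh.write(WRAPPER_TEMPLATE.format(real_nwchem=real_nwchem))

    # Add execute permission for user, group, and others.
    mode = os.stat(wrapper).st_mode
    os.chmod(wrapper, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)

    # Make the wrapper win the PATH lookup for this process and its children.
    os.environ["PATH"] = wrapper_dir + os.pathsep + os.environ.get("PATH", "")
    return wrapper
```

For example, `install_nwchem_wrapper("/home/jacob/.nix-profile/bin/nwchem", "/tmp/nwchem-wrapper")` would be called once before `qcng.compute(model, "nwchem")`; the second path is only a placeholder.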

jlheflin avatar Jun 08 '25 20:06 jlheflin

Hi, my first suspicion was that nwchem wasn't configured for MPI the way mrchem is, but it looks like it is. Are you passing something like https://github.com/MolSSI/QCEngine/blob/master/qcengine/tests/test_config.py#L207 in your run, e.g. qcng.compute(atin, "nwchem", task_config={"mpiexec_command": ...})?

loriab avatar Jun 16 '25 16:06 loriab

Hi! Yes, that is similar to what I was trying to run. I just tested this task_config:

local_opts = {
    "nnodes": 1,
    "ncores": 4,
    "mpiexec_command": "mpirun -n {total_ranks}",
}
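To make the `{total_ranks}` placeholder concrete, here is a minimal sketch of how such a template could be expanded into a launcher command. It is not QCEngine's implementation: `expand_mpiexec` is a hypothetical helper, and the placeholder names `nnodes`, `ranks_per_node`, and `cores_per_rank`, along with the way ranks are derived from `nnodes` and `ncores`, are assumptions that should be checked against the QCEngine configuration documentation.

```python
import shlex


def expand_mpiexec(task_config, cores_per_rank=1):
    """Expand the placeholders in task_config['mpiexec_command'] into an argv list.

    Assumption: ranks per node = ncores // cores_per_rank, and
    total ranks = nnodes * ranks per node.
    """
    nnodes = task_config["nnodes"]
    ranks_per_node = task_config["ncores"] // cores_per_rank
    values = {
        "nnodes": nnodes,
        "ranks_per_node": ranks_per_node,
        "cores_per_rank": cores_per_rank,
        "total_ranks": nnodes * ranks_per_node,
    }
    # Placeholders not present in the template are simply ignored by str.format.
    return shlex.split(task_config["mpiexec_command"].format(**values))


local_opts = {
    "nnodes": 1,
    "ncores": 4,
    "mpiexec_command": "mpirun -n {total_ranks}",
}
print(expand_mpiexec(local_opts))  # ['mpirun', '-n', '4']
```

The dictionary would be handed to QCEngine as `qcng.compute(model, "nwchem", task_config=local_opts)`, as suggested in the previous comment. QCEngine's task configuration also appears to include a `use_mpiexec` flag; whether it must be set for the launcher command to be used with NWChem is not established in this thread.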

I'm not fully sure which options are available, but no matter what I try I usually get the following error:

In [12]: print(ret.error.error_message)
QCEngine Unknown Error: INPUT:
echo
geometry units bohr
O                     0.000000000000     0.000000000000    -0.243774670000
H                     0.000000000000    -2.823250830000     1.940748730000
H                     0.000000000000     2.823250830000     1.940748730000

end
memory 1290235019 double
charge 0

basis 
  H library sto-3g
  bqH library H sto-3g
  O library sto-3g
  bqO library O sto-3g
end

task scf energy

STDOUT:

STDERR:
[0] Received an Error in Communication: (2) ranks per node, must be at least
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 0 in communicator MPI COMM 3 DUP FROM 0
  Proc: [[45766,0],0]
  Errorcode: 2

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------

TRACEBACK:
NoneType: None

This is what happens whenever I don't use the wrapper script above.

jlheflin avatar Jun 16 '25 17:06 jlheflin