easybuild-easyconfigs

{bio}[foss/2023a] GROMACS v2023.4 w/ CUDA 12.1.1

Open krtzr opened this issue 11 months ago • 8 comments

(created using eb --new-pr)

krtzr, Mar 19 '24 15:03

@boegelbot please test @ generoso

casparvl, Mar 19 '24 21:03

@casparvl: Request for testing this PR well received on login1

PR test command 'EB_PR=20154 EB_ARGS= EB_CONTAINER= EB_REPO=easybuild-easyconfigs /opt/software/slurm/bin/sbatch --job-name test_PR_20154 --ntasks=4 ~/boegelbot/eb_from_pr_upload_generoso.sh' executed!

  • exit code: 0
  • output:
Submitted batch job 13146

Test results coming soon (I hope)...

- notification for comment with ID 2008187035 processed

Message to humans: this is just bookkeeping information for me, it is of no use to you (unless you think I have a bug, which I don't).

boegelbot, Mar 19 '24 21:03

Test report by @boegelbot
FAILED
Build succeeded for 0 out of 1 (1 easyconfigs in total)
cns1 - Linux Rocky Linux 8.9, x86_64, Intel(R) Xeon(R) CPU E5-2667 v3 @ 3.20GHz (haswell), Python 3.6.8
See https://gist.github.com/boegelbot/656cfdb22ab59acc6e89be3fe7b108e5 for a full test report.

boegelbot, Mar 19 '24 22:03

Test report by @casparvl
FAILED
Build succeeded for 0 out of 1 (1 easyconfigs in total)
tcn1.local.snellius.surf.nl - Linux RHEL 8.6, x86_64, AMD EPYC 7H12 64-Core Processor, Python 3.6.8
See https://gist.github.com/casparvl/4962a549c4e108a7f54e8aa50442ac53 for a full test report.

casparvl, Mar 19 '24 23:03

Test report by @Micket
FAILED
Build succeeded for 0 out of 1 (1 easyconfigs in total)
vera-gpu1 - Linux Rocky Linux 8.9, x86_64, Intel(R) Xeon(R) Gold 6130 CPU @ 2.10GHz, 2 x NVIDIA Tesla V100-SXM2-32GB, 550.54.14, Python 3.6.8
See https://gist.github.com/Micket/a48d30dff7a87ee0ba871a860cc6c7c1 for a full test report.

Micket, Mar 20 '24 10:03

I think on Generoso we are running into the same error that was encountered in the non-CUDA version: https://github.com/easybuilders/easybuild-easyconfigs/pull/19728

The errors that @Micket and I ran into seem to be different, but I don't really understand what the actual error is here:

Detailed error:

fh = <_io.TextIOWrapper name='/gpfs/nvme1/1/casparl/ebtmpdir/eb-z4p_t53k/pytest-of-casparl/pytest-2/spc_water_box0/topology.top' mode='w' encoding='UTF-8'>
gmxapi = <module 'gmxapi' from '/gpfs/nvme1/1/casparl/ebbuildpath/GROMACS/2023.4/foss-2023a-CUDA-12.1.1/easybuild_obj/python_packaging/gmxapi/gmxapi_staging/gmxapi/__init__.py'>
gmxcli = PosixPath('/gpfs/nvme1/1/casparl/ebbuildpath/GROMACS/2023.4/foss-2023a-CUDA-12.1.1/easybuild_obj/bin/gmx_mpi')
scoped_chdir = <function scoped_chdir at 0x153a88f87740>
solvate = <OperationHandle (<ResourceManager gmxapi.commandline.commandline_operation..merged_ops0_i0: width=1, director...at all other processes were killed!\n']>, stdout: <OutputData(ResultDescription(dtype=str, width=1)) "stdout": ['']>>)>
structurefile = '/gpfs/nvme1/1/casparl/ebtmpdir/eb-z4p_t53k/pytest-of-casparl/pytest-2/spc_water_box0/structure.gro'
tempdir = PosixPath('/gpfs/nvme1/1/casparl/ebtmpdir/eb-z4p_t53k/pytest-of-casparl/pytest-2/spc_water_box0')
testdata = {'datasource': {'solvent_structure': 'src/testutils/simulationdatabase/spc216.gro', 'solvent_topology': 'src/testutils... "oplsaa.ff/forcefield.itp"', '', '; Include water topology', '#include "oplsaa.ff/tip3p.itp"', '', '[ system ]', ...]}
testdir = '/gpfs/nvme1/1/casparl/ebbuildpath/GROMACS/2023.4/foss-2023a-CUDA-12.1.1/gromacs-2023.4/python_packaging/gmxapi/test'
tmp_path_factory = TempPathFactory(_given_basetemp=None, _trace=<pluggy._tracing.TagTracerSub object at 0x153a88faa050>, _basetemp=PosixPath('/gpfs/nvme1/1/casparl/ebtmpdir/eb-z4p_t53k/pytest-of-casparl/pytest-2'), _retention_count=3, _retention_policy='all')
top_input = ['#include "oplsaa.ff/forcefield.itp"', '', '; Include water topology', '#include "oplsaa.ff/tip3p.itp"', '', '[ system ]', ...]
topfile = '/gpfs/nvme1/1/casparl/ebtmpdir/eb-z4p_t53k/pytest-of-casparl/pytest-2/spc_water_box0/topology.top'

../../../../gromacs-2023.4/python_packaging/gmxapi/test/conftest.py:125: RuntimeError
____________________ ERROR at setup of test_write_tpr_file ____________________

gmxcli = PosixPath('/gpfs/nvme1/1/casparl/ebbuildpath/GROMACS/2023.4/foss-2023a-CUDA-12.1.1/easybuild_obj/bin/gmx_mpi')
tmp_path_factory = TempPathFactory(_given_basetemp=None, _trace=<pluggy._tracing.TagTracerSub object at 0x153a88faa050>, _basetemp=PosixPath('/gpfs/nvme1/1/casparl/ebtmpdir/eb-z4p_t53k/pytest-of-casparl/pytest-2'), _retention_count=3, _retention_policy='all')

@pytest.fixture(scope="session")
def spc_water_box_collection(gmxcli, tmp_path_factory):
    """Provide a collection of simulation input items for a simple simulation.

    Prepare the MD input in a freshly created working directory.
    Solvate a 5nm cubic box with spc water. Return a dictionary of the artifacts produced.
    """
    import gmxapi.runtime
    from gmxapi.testsupport import scoped_chdir

    # TODO: (#2896) Fetch MD input from package / library data.
    # Example:
    #     import pkg_resources
    #     # Note: importing pkg_resources means setuptools is required for running this test.
    #     # Get or build TPR file from data bundled via setup(package_data=...)
    #     # Ref https://setuptools.readthedocs.io/en/latest/setuptools.html#including-data-files
    #     from gmx.data import tprfilename

    with scoped_chdir(tmp_path_factory.mktemp("spc_water_box")) as tempdir:

        testdir = os.path.dirname(__file__)
        with open(os.path.join(testdir, "testdata.json"), "r") as fh:
            testdata = json.load(fh)

        # TODO: (#2756) Don't rely on so many automagical behaviors (as described in comments below)

        structurefile = os.path.join(tempdir, "structure.gro")
        # We let `gmx solvate` use the default solvent. Otherwise, we would do
        #     gro_input = testdata['solvent_structure']
        #     with open(structurefile, 'w') as fh:
        #         fh.write('\n'.join(gro_input))
        #         fh.write('\n')

        topfile = os.path.join(tempdir, "topology.top")
        top_input = testdata["solvent_topology"]
        # `gmx solvate` will append a line to the provided file with the molecule count,
        # so we strip the last line from the input topology.
        with open(topfile, "w") as fh:
            fh.write("\n".join(top_input[:-1]))
            fh.write("\n")

        assert os.path.exists(topfile)
        solvate = gmxapi.commandline_operation(
            gmxcli,
            arguments=["solvate", "-box", "5", "5", "5"],
            # We use the default solvent instead of specifying one.
            # input_files={'-cs': structurefile},
            output_files={
                "-p": topfile,
                "-o": structurefile,
            },
        )
        assert os.path.exists(topfile)

        if solvate.output.returncode.result() != 0:
            logging.debug(solvate.output.stderr.result())
            raise RuntimeError("solvate failed in spc_water_box testing fixture.")

E RuntimeError: solvate failed in spc_water_box testing fixture.
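
The fixture only logs the stderr of `gmx solvate` at debug level, so the `RuntimeError` above hides the underlying message. The truncated stderr in the locals dump ends in "at all other processes were killed!", which may point at the MPI runtime rather than at solvate itself, but that is a guess from a fragment. A minimal sketch for rerunning the same command outside pytest and printing its full stderr is given below. It uses only the standard library (Python 3.7+); the helper names `build_solvate_cmd`, `run_and_report` and `rerun_solvate` are hypothetical and not part of gmxapi, and the solvate arguments are copied from the fixture.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def build_solvate_cmd(gmxcli, topfile, structurefile):
    """Return the argv that the fixture effectively asks gmxapi to run."""
    return [str(gmxcli), "solvate", "-box", "5", "5", "5",
            "-p", str(topfile), "-o", str(structurefile)]


def run_and_report(cmd, cwd=None):
    """Run cmd with captured output and print its stderr if it fails."""
    proc = subprocess.run(cmd, cwd=cwd, capture_output=True, text=True)
    if proc.returncode != 0:
        print(f"command failed with exit code {proc.returncode}", file=sys.stderr)
        print(proc.stderr, file=sys.stderr)
    return proc


def rerun_solvate(gmxcli, top_lines):
    """Recreate the fixture's topology input in a scratch directory and run solvate.

    top_lines is assumed to be the "solvent_topology" list from testdata.json.
    """
    with tempfile.TemporaryDirectory() as tmp:
        topfile = Path(tmp) / "topology.top"
        structurefile = Path(tmp) / "structure.gro"
        # Like the fixture, drop the last line: solvate appends the molecule count.
        topfile.write_text("\n".join(top_lines[:-1]) + "\n")
        cmd = build_solvate_cmd(gmxcli, topfile, structurefile)
        return run_and_report(cmd, cwd=tmp)
```

Calling `rerun_solvate` with the `gmx_mpi` path shown in the locals and the `solvent_topology` list from `testdata.json` should reproduce the failing step. Alternatively, running pytest with `--log-level=DEBUG` should make the fixture's own `logging.debug` output show up in the failure report.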

casparvl, Mar 20 '24 12:03

Test report by @casparvl
FAILED
Build succeeded for 0 out of 1 (1 easyconfigs in total)
tcn1.local.snellius.surf.nl - Linux RHEL 8.6, x86_64, AMD EPYC 7H12 64-Core Processor, Python 3.6.8
See https://gist.github.com/casparvl/194f34b1565c618fa46d1f99bc2dee52 for a full test report.

casparvl, Mar 20 '24 12:03

Hm, I figured I'd try building on a different filesystem to see if that made a difference. Since these are diskless nodes, I used /dev/shm. It's no better; it actually fails earlier... :\

casparvl, Mar 20 '24 12:03

What is the status here? - All checks seem to have passed.

krtzr, Apr 11 '24 15:04