Testing refactor

Open mwaxmonsky opened this issue 1 year ago • 11 comments

Refactoring of testing infrastructure

Changes include:

  • Refactored the Python tests to use the unittest framework.
  • Replaced the shell tests with equivalent wrappers in Python.
  • Reduced value duplication across the Fortran/shell/Python files to just the Fortran and Python files.
  • Added reusable CMake infrastructure for the capgen, advection, ddthost, and var_compatability tests.
  • Enabled reuse of the OpenMP and MPI configuration for both the library and the tests.
  • Removed the PGI compiler options and added ifx support.
  • Replaced manual invocation of the Fortran tests through run_test with ctest (a minimal sketch of the ctest registration follows this list).
  • Removed a few unused files.
  • Added missing dependencies and --exclude-required tests.
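
For context, the switch from run_test to ctest roughly follows the standard CMake pattern below. This is a minimal sketch only: the target and test names are placeholders, not the actual files added in this PR.

# Build a test executable and register it with ctest so `ctest` discovers it.
add_executable(capgen_test test_host.F90)
target_link_libraries(capgen_test PRIVATE ccpp_framework)

add_test(NAME capgen_test COMMAND capgen_test)
# Fail the test if the program prints a failure marker, not just on exit code.
set_tests_properties(capgen_test PROPERTIES FAIL_REGULAR_EXPRESSION "FAIL")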

TODO:

  • [x] Convert Fortran tests to pFUnit tests
  • [x] Determine optimal layout of python direct/command line calls
  • [ ] ~~Determine whether yaml or json is feasible for Fortran so there is a single source for variable names, and integrate it into the workflow~~ Will implement in a separate PR to keep the review manageable.

User interface changes?: No

Fixes: None

Resolves https://github.com/NCAR/ccpp-framework/issues/514

Testing:
  test removed: None
  unit tests:
  system tests:
  manual testing:

mwaxmonsky avatar Nov 11 '24 04:11 mwaxmonsky

Greetings all, apologies for throwing this out there without an issue first. I've been trying to keep up with the changes in the framework and the equivalent tests, was getting lost in the details, and noticed a few things that I thought would make life a little easier overall.

My main goal was to make it more explicit what is being tested and to reduce the amount of duplication we have, but that necessitated quite a few changes at various levels of the infrastructure.

I started going down a rabbit hole of how the framework is built and tested but before going much further, I figured I would share some of the things I've updated and check in to see if these make sense and are useful for the rest of the team.

There are a few technical details that still need to be evaluated, but I wanted to check whether, overall, this is moving things in a reasonable direction or making things more complicated.

mwaxmonsky avatar Nov 11 '24 05:11 mwaxmonsky

Hi @mwaxmonsky, thanks for tackling this.

So far, most of this looks like much needed cleanup!

There is one change I do not understand. The old (clumsy, wordy) run_test script tested both the Python and the bash interfaces to ccpp_datafile.py. Do the new tests test the bash interface? These are (or at least were) used by some build systems.

gold2718 avatar Nov 11 '24 08:11 gold2718

@gold2718 Yup! I stayed as faithful to the current coverage as I could and turned each of the shell tests into its own unit test at the bottom of the Python file using the subprocess.run(...) API (a minimal sketch of the pattern is below). If I missed anything on that front, I'm happy to add and address whatever might be missing.
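
As an illustration of that pattern only (not the actual test code, and the command-line arguments shown here are hypothetical), a shell-level check can be folded into unittest roughly like this:

import subprocess
import unittest

class TestDatafileCLI(unittest.TestCase):
    """Exercise the script the same way a build system would, via its CLI."""

    def test_help_exits_cleanly(self):
        # Hypothetical invocation; the real tests pass actual ccpp_datafile.py
        # arguments and compare the output against expected values.
        result = subprocess.run(
            ["python3", "scripts/ccpp_datafile.py", "--help"],
            capture_output=True, text=True, check=False)
        self.assertEqual(result.returncode, 0)
        self.assertIn("usage", result.stdout.lower())

if __name__ == "__main__":
    unittest.main()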

mwaxmonsky avatar Nov 11 '24 15:11 mwaxmonsky

There is a lot of great stuff in this PR, thanks @mwaxmonsky.

My main concern with this PR is that it touches code/build files that aren't used only by the unit tests but also by production code (e.g. the top-level CMakeLists.txt file). For instance, this PR enforces -O0 and certain other compiler flags for the CCPP framework. We definitely do not want -O0 in production environments. There must be separate options to set compilers for unit testing and for host models like the UFS, CAM/SIMA, and NEPTUNE, and there must be a way for the host model to define compiler flags that override any defaults (if we want to set defaults for cases other than running the tests).

There are two comments below that need fixing.

We also need to consider targeting the main branch for this PR due to the impact on the host modeling systems. This PR may have to come straight after the current develop is merged into main. And it must be written in a way that allows ccpp-prebuild to continue to work; otherwise this PR (or the develop branch, if this PR gets merged into develop) will be blocked until all host models have moved to capgen. We don't want this; it was a heavy lift last time to get feature/capgen merged into main after a long period of parallel development.

@climbfuji Happy to remove the -O0 flag and the other flags (I meant to make those target-specific but wanted to get feedback on everything else before going further with the build); that was just in there from debugging, and we definitely don't want it in production environments.

In terms of options for different compilers/flags, I'll look into the different CMake APIs and see what makes the most sense (something along the lines of the sketch below).
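
One possible direction, shown purely as a sketch (the target name is a placeholder, and the CCPP_FRAMEWORK_ENABLE_TESTS switch is the one that appears in the build commands later in this thread), is to keep the debugging flags scoped to the test targets and leave the library flags entirely to the host model:

# The framework library picks up whatever the host model sets via
# CMAKE_Fortran_FLAGS / CMAKE_BUILD_TYPE; nothing is hard-coded for it.
# Debug-friendly options are attached only to the test executables.
if(CCPP_FRAMEWORK_ENABLE_TESTS)
  target_compile_options(capgen_test PRIVATE
    -O0 -g
    $<$<Fortran_COMPILER_ID:Intel,IntelLLVM>:-traceback>)
endif()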

As for targeting main, I'm happy to pivot there, but the prebuild tests run successfully, the same as on main. Is the issue that users would have to update their model code and their build integration with the framework? I'm just trying to understand the concern with targeting develop instead of main.

mwaxmonsky avatar Nov 11 '24 23:11 mwaxmonsky

Yes, if you change the top-level CMakeLists.txt, you will impact the host models that use the ccpp-framework. UFS, SCM and NEPTUNE all use the ccpp-framework CMakeLists.txt.

climbfuji avatar Nov 12 '24 03:11 climbfuji

Also, this might be better saved for a discussion thread, but I was wondering about the references to MPI and OpenMP. I don't see any OpenMP pragmas or types, and currently the only reference to MPI is the mpi_comm in ccpp_types; even there we only declare a public reference and never use it internally.

I see a note in common.py about MPI:

# Maximum number of concurrent CCPP instances per MPI task
CCPP_NUM_INSTANCES = 200

and this value is used in the write function in mkstatic, but it doesn't look like it is directly tied to the framework being built with MPI support.

If we don't need MPI or OpenMP at build time for the framework, would it make sense to remove these references from the CMake files and ccpp_types, or are they actively being used and I'm just missing a key detail?

mwaxmonsky avatar Nov 12 '24 15:11 mwaxmonsky

Currently, capgen generates omp_get_thread_num calls in the suite caps. I don't think there is any MPI code generated currently by capgen.

gold2718 avatar Nov 12 '24 19:11 gold2718

ccpp-prebuild doesn't use any OpenMP calls in the auto-generated caps (everything comes from the host model via ccpp_t and its components), but the auto-generated caps and the code in src/ccpp_types.F90 do need MPI. Removing the MPI dependency may break host models currently using ccpp_prebuild.

climbfuji avatar Nov 12 '24 19:11 climbfuji

@climbfuji Is there a json or yaml library (or other data-format library) for Fortran currently available on the NEPTUNE side, such as https://github.com/jacobwilliams/json-fortran? If so, I can consolidate the two places where we declare all of the variables in the tests into a single data file; otherwise, we will have to deal with two files that need to be updated in parallel (instead of the current three).
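
To make the consolidation concrete: the single source could be a small JSON (or YAML) file that both sides read. The layout and file name below are purely hypothetical, just to illustrate the idea; the Python side needs only the standard library, while the Fortran side would need something like json-fortran.

import json

# Hypothetical single source of truth for the test variable metadata.
with open("test/var_metadata.json", encoding="utf-8") as handle:
    metadata = json.load(handle)

for var in metadata["variables"]:
    print(var["standard_name"], var["units"], var["dimensions"])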

mwaxmonsky avatar Feb 01 '25 19:02 mwaxmonsky

@mwaxmonsky For a while, we had yafyaml in spack-stack, mainly for GEOS/MAPL. Right now, I believe it's only libyaml, a C library. But I have been playing with https://github.com/Nicholaswogan/fortran-yaml-c, which is a simple interface to libyaml. I would like to get this set up for NEPTUNE for several reasons. Also, ESMF has yaml reading capabilities.

As far as json goes, there is the nlohmann-json/nlohmann-json-schema-validator C++ interface in spack-stack. The json-fortran library looks great, has very few dependencies, and comes with a CMake build system. It should be easy to add to spack if it's not already there. Having a json Fortran reader seems like a good idea to me in general.

climbfuji avatar Feb 03 '25 11:02 climbfuji

Hey all, apologies for the delay with this; it took far longer than I expected, and there's still more to look into if we want to follow up with more refactoring. So I'm cutting this one off here feature-wise, and we can come up with a game plan and priorities for the items I couldn't get to in follow-up PRs.

Overall, I think this leads us in a good direction that leverages as much of Python and CMake as we can at the moment. I'll follow up later with a PR for the json/yaml integration, and I have one in the pipeline for the Doxygen CI as well.

There are a few high level questions we might want to clarify to ease future contributions such as:

  1. How do we feel about integrating pFUnit with the integration tests?
  2. Is there a preference for how we handle integration-test scenarios that are expected to fail (such as the one in the advection test)? It's not an issue, but it does feel inconsistent, and it's not clear when we should and should not include these cases.
  3. What is the general feeling about driving command-line tests with pytest compared to using shell scripts? I personally like the pytest approach (a minimal sketch follows this list), since we get reporting and additional utilities without having to implement our own shell-script framework, but these tests do stand out next to the regular unit tests. Admittedly there might not be a clear answer here, and it may be more a matter of preference than a technical one.
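
For comparison, a pytest version of a command-line check would look roughly like this (a sketch only; the invocation and expected behavior are assumptions, not actual tests from this branch):

import subprocess

def test_capgen_without_arguments_reports_usage():
    # Assumption: with no arguments the script exits non-zero and prints
    # its usage message, as argparse-based tools typically do.
    result = subprocess.run(
        ["python3", "scripts/ccpp_capgen.py"],
        capture_output=True, text=True, check=False)
    assert result.returncode != 0
    assert "usage" in result.stderr.lower()

The appeal is mostly the free machinery: parametrization, fixtures, and reporting come with pytest (and with ctest via a wrapper) without a hand-rolled shell framework.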

mwaxmonsky avatar Apr 12 '25 20:04 mwaxmonsky

Hey all, after debugging the latest round of build failures, we're back to sequential builds failing with the Intel compilers. For example, running:

cmake -S. -B./build -DCCPP_FRAMEWORK_ENABLE_TESTS=ON -DCMAKE_PREFIX_PATH=path/to/pfunit
cd build
make
ctest

builds and runs as expected with GNU, but on Derecho, for both ifort and ifx, we get the same build error after running make:

/glade/u/home/mwaxmonsky/source/ccpp-framework-mwaxmonsky/build/test/var_compatibility_test/var_compatibility_host_integration.inc(1): error #7977: The type of the function reference does not match the type of the function definition.   [TEST_VAR_COMPATIBILITY_INTEGRATION_SUITE]
call suite%addTest(test_var_compatibility_integration_suite())
^
/glade/u/home/mwaxmonsky/source/ccpp-framework-mwaxmonsky/build/test/var_compatibility_test/var_compatibility_host_integration_driver.F90(54): error #7002: Error in opening the compiled module file.  Check INCLUDE paths.   [LOADER]
   use loader
-------^
/glade/u/home/mwaxmonsky/source/ccpp-framework-mwaxmonsky/build/test/var_compatibility_test/var_compatibility_host_integration_driver.F90(82): warning #8889: Explicit interface or EXTERNAL declaration is required.   [FUNIT_MAIN]
   call funit_main(load_tests, extra_initialize, extra_finalize)
--------^
/glade/u/home/mwaxmonsky/source/ccpp-framework-mwaxmonsky/build/test/var_compatibility_test/var_compatibility_host_integration_driver.F90(82): error #6404: This name does not have a type, and must have an explicit type.   [LOAD_TESTS]
   call funit_main(load_tests, extra_initialize, extra_finalize)
-------------------^
compilation aborted for /glade/u/home/mwaxmonsky/source/ccpp-framework-mwaxmonsky/build/test/var_compatibility_test/var_compatibility_host_integration_driver.F90 (code 1)
make[2]: *** [test/var_compatibility_test/CMakeFiles/var_compatibility_host_integration.dir/build.make:105: test/var_compatibility_test/CMakeFiles/var_compatibility_host_integration.dir/var_compatibility_host_integration_driver.F90.o] Error 1
make[1]: *** [CMakeFiles/Makefile2:433: test/var_compatibility_test/CMakeFiles/var_compatibility_host_integration.dir/all] Error 2

I've tried debugging this further, but it is challenging to figure out whether the issue is truly a pFUnit issue or possibly a CMake or even a make issue, since it goes away if you use make -j3 or similar.

I can continue to debug this, but seeing as this has been a long-standing build issue, is it worth continuing down this path? I'm happy to do so, but it doesn't look like a fix will be straightforward.

An alternative I could try is keeping our current implementation of the tests (with the module/program in the same file), integrating them into ctest natively, and sidestepping pFUnit entirely (though there may be some work to make that happen as well).

Any suggestions or preferences would be greatly appreciated!

mwaxmonsky avatar Jun 26 '25 19:06 mwaxmonsky

A quick follow up to my previous post:

The issue traces back to a longstanding ifort/ifx bug that was supposed to be fixed in the 2024 release (I'm not sure why our builds on Derecho are still hitting it; the fix may have applied only to ifx, and even then it may have regressed in 2025.1?).

A quick fix was to change all -warn flags to -warn all,nointerfaces, as suggested in a pre-existing pFUnit GitHub issue (see the sketch below), but long term we'll have to work with the Intel team to see whether they re-introduced a regression.
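
For reference, the workaround amounts to something like the CMake fragment below (a sketch, not the exact change on the branch; the target name is taken from the error output above):

# ifort/ifx trip over the pFUnit-generated driver include files when
# -warn all enables interface checking, so relax only that category.
if(CMAKE_Fortran_COMPILER_ID MATCHES "Intel")
  target_compile_options(var_compatibility_host_integration PRIVATE
    -warn all,nointerfaces)
endif()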

I am happy to keep looking at ways to mitigate this if the -warn all flag is required for builds (though the solution might be a little more involved), so if there are any suggestions on how best to proceed, I'm all ears!

mwaxmonsky avatar Jun 27 '25 20:06 mwaxmonsky

@mwaxmonsky, as explained in the output of ./scripts/ccpp_capgen.py --help:

--debug               Add variable allocation checks to assist debugging

That is, extra code is generated in the caps to check the allocation status of certain variables.

It is used within the capgen Python code in the line if self.__group.run_env.debug: in scripts/suite_objects.py.

In the generated code, you will see sections within blocks:

         ! ##################################################################
         ! Begin debug tests
         ! ##################################################################
<variable checking code>
         ! ##################################################################
         ! End debug tests
         ! ##################################################################

gold2718 avatar Jul 03 '25 01:07 gold2718

@mwaxmonsky I see you made a few more changes in response to the approving reviews. Are there any more changes you planned to make or is it ready to merge?

mkavulich avatar Jul 14 '25 19:07 mkavulich

@mkavulich I'm happy with where this is and don't have any additional changes in mind, so unless there are any last-minute comments, I think we're good to go!

mwaxmonsky avatar Jul 14 '25 19:07 mwaxmonsky