Albany LandIce: remove FO_AIS tests from debug nightlies

I'd like to remove the FO_AIS tests from debug builds. These tests show quite a bit of variability even in opt mode (ranging from 2m to 8m even), so they can timeout in debug quite easily, causing failure emails to be sent out to those who own/watch the builds on cdash. Alternatively, we could increase the timeout in debug builds. However, I'm not sure if the AIS tests are testing features not already tested. If the point is to test the ML/MueLu preconditioners, we could probably use those in some other smaller test (like FO_MMS or some of the FO_GIS ones).

@ikalash @mperego what do you think?

Jul 01 '19 21:07 bartgol

I'm fine with this. Just FYI, I think the FO_AIS tests do not always time out in the debug builds - on the CEE ones, it depends on what else is running on a given compute node.

In terms of what the FO_AIS tests are testing that is unique - it is the special ML/MueLu preconditioner based on semi-coarsening that Ray designed. You really need this preconditioner for AIS problems, therefore just enabling it for another problem will not accomplish the same thing. We have identified bugs in ML/MueLu that crept in through the FO_AIS tests, so they are important.

I would actually vote for keeping the FO_AIS tests in some of the debug builds, in particular, the debug build on my machine camobap. Since there isn't ever anything other than the nightlies running on it at night, the FO_AIS tests shouldn't time out randomly on some nights like on the CEE. A way to accomplish this would be to add a flag to turn on AIS or not. That may not seem ideal but it's better than having logic to identify if a build is debug, and turn on FO_AIS for that build, as I could see someone might want it on in a given debug build if he/she wanted to debug something pertaining to FO_AIS. I'm not sure if this is how you planned to disable the tests in some of the builds @bartgol , or you had some other approach in mind.

Jul 01 '19 21:07 ikalash

I'm ok with pretty much any strategy that would prevent random failures due to timeout, when the reason for the time out is not on the albany side. Disabling a single test requires some extra cmake logic, which is in a way polluting. We could move the AIS tests into the large tests folder, which is not enabled by default (@ikalash not sure if you do enable it in all your nightlies anyways though).

Jul 01 '19 21:07 bartgol

I didn't even know you could turn large tests off. The issue in turning off the large tests in debug builds is there is stuff there that needs to be tested. For example, all our Schwarz stuff is in the large tests irectory, and I want that tested in the debug nightlies. I suppose a lot of the "real" Schwarz testing could actually go in the small tests directory, as I think there are a bunch of small tests, they just happened to go in the large directory. Aeras also is in the large directory.

This brings up the question of how to define "large". Initially I think the large directory was intended for tests that have large exodus files, for the purpose of removing these files from the main Albany repo. I'm not sure if that'll ever happen at this stage, to be honest.

One could also conceive of "long" tests, which we currently do not have a concept of I think (though some of the "large" tests also take awhile). One could I suppose add the option to run "long" tests and move some of the tests like the AIS ones there, as well as other tests that sometimes time out.

Jul 01 '19 22:07 ikalash

I actually thought that large tests meant also long. Large exodus files should be a good proxy for 'long test', either because of large mesh or large number of timesteps. I guess the exception would be "large number of fields saved".

I get your desire to test everything in all builds, but I think debug builds should not run long tests. A debug build is meant to catch semantic errors. You can (or, better, should) be able to do that with small and fast tests. Whether your solver/preconditioner/whatever does work or not is already checked by release builds. Also, if the large test does not pass, then a debug build would a) not help at all (since the problem is algorithmic, not a c++ bug), b) help regardless of the size of the problem (e.g., you read out of bounds).

Jul 02 '19 00:07 bartgol

@bartgol : that makes sense. I think the broader issue I'm bringing up is the tests should probably be cleaned up somewhat, as I know that a lot of the large tests aren't large. For instance, Schwarz has some large tests with large-ish methods but I don't believe any of those run nightly - they are just there in the repo. Same goes for Aeras - Aeras was classified as "large" b/c there are some large mesh files, but many tests don't use them.

I think the tests that are run across a wide range of platforms need to cover the physics we want to test, which would not be the case were we to turn off "large" tests in some builds simply b/c the tests were mis-classified. I'm happy to try to rearrange the Schwarz and Aeras ones, if someone (@ibaned is likely the right person) what was the classification of a "large" test - I am pretty sure it was based on directory size. @lxmota I'd need to rely on you to clean up the large LCM tests, but I suspect it's OK to keep those all in the large dir, as I think a lot of those do not run, and those that do, are not that much of interest these days.

Jul 02 '19 00:07 ikalash

The organization of tests into a "large" directory was based entirely on an effort to reduce the size of the Albany source repository. I wanted to move all tests to their own repository but @agsalin insisted that some tests be left in the source repository, hence the need to separate them between two folders. To the extent that we care about clone times and disk space for the Albany source code, we should take care not to check large amounts of test data into it.

Jul 02 '19 02:07 ibaned

I'm all for keeping the (storage) size of tests small. Personally, I think we should not even have 10k+ DOFs tests, and let each project (LandIce, LCM, Aeras, whatnot) have their own dashboard where they check the problem to whatever degree of intensity they choose, leaving the shared and common repo to a squeaky clean and fast state.

That said, I understand my point of view is not probably shared by many, so I'm fine with some larger tests, both in terms of storage space and execution time. However, I think users should be able to 1) download a reasonable size repo (hence, not too many large tests, and not too large to start with), and 2) run a relatively comprehensive testsuite of all physics packages without time limits. Point 1) is achieved by limiting the number of large tests to a small set that really needs to be tested at scale (and we should think carefully whether we really need large scale to start with); we can also thinking about reviving the second repo idea, and confining all (possibly but a handful) of large tests to this separate repo. Point 2) is probably easier, and basically means we should sort tests appropriately in the small and large tests folders. We can discuss what shuold be the maximum time limit of a test to be placed in the small folder (I would vote for 60s or less, on an average serial run, but I know people will push against such restrictive measures), and then start doing this.

If there's no urge to change the current behavior, we could add this as an item for the Albany User Meeting in (probably) January.

Jul 02 '19 14:07 bartgol