
Cut long-running tests & examples out of 'make test'

Open PhilMiller opened this issue 10 years ago • 9 comments

Original issue: https://charm.cs.illinois.edu/redmine/issues/595


Identify tests/examples that take substantial time to run in the current make test. Either find parameters that make them run faster, or remove them from make test if they don't exercise features not covered by other tests. If necessary, add a separate make perftest target that runs them with parameters that would generate meaningful benchmark numbers.
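
As a starting point for finding the slow tests, a minimal Python sketch of a timing harness follows. It is not part of the Charm++ build system. The function names and the placeholder commands are assumptions for illustration, and a real run would substitute the actual per-directory test invocations.

```python
import subprocess
import sys
import time


def time_command(argv):
    """Run argv to completion and return (wall-clock seconds, exit code)."""
    start = time.perf_counter()
    result = subprocess.run(
        argv, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    return time.perf_counter() - start, result.returncode


def rank_by_time(commands):
    """Time each named command; return (name, seconds, exit code), slowest first."""
    rows = []
    for name, argv in commands.items():
        seconds, code = time_command(argv)
        rows.append((name, seconds, code))
    return sorted(rows, key=lambda row: row[1], reverse=True)


if __name__ == "__main__":
    # Placeholder commands (hypothetical); replace with real test invocations.
    demo = {
        "quick": [sys.executable, "-c", "pass"],
        "slow": [sys.executable, "-c", "import time; time.sleep(0.5)"],
    }
    for name, seconds, code in rank_by_time(demo):
        print(f"{name:10s} {seconds:7.3f}s  exit={code}")
```

Sorting slowest first puts the candidates for shorter parameters, or for a separate perftest target, at the top of the list.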

Screen Shot 2014-11-03 at 11 30 59 PM

PhilMiller avatar Oct 28 '14 21:10 PhilMiller

Original date: 2014-11-04 05:15:49


Running all the tests takes only 80s. Here is the time breakdown for each directory, in seconds: AMPI 8, Charm++ 37, Converse 34, FEM 0.2, Util 0.02.

However, there are tests in the tests directory that are never run by the autobuild. Some of them even crash. Attached is a file listing them, along with the name of the person who either wrote each test or made the last significant modifications to it.

AMPI

    Test                     Status           Author / note
    chkpt                    fails            Gengbin
    fallreduce               compile fails    Gengbin
    jacobi3d                 no test target   Esteban
    mpich-test               no test target
    speed                    works!           Gengbin

Charm++

    Test                     Status           Author / note
    kNeighbor                no test target   Chao
    array4D                  crash            Abhinav
    pmetest                  compile fails    Sameer
    arrayPerf                works!           Sameer
    python                   compile fails    Filippo
    topology                 works!           Abhinav
    broadcast                works!           Phil
    io                       no test target   Phil
    reductiontesting         no test target   Rahul
    jacobi3d                 no test target   used for fttest
    commSpeed                works!           Terry
    jacobi3d-gauss           no test target   Yanhua
    penciltest               compile fails    Sameer
    commtest                 crash            Abhishek
    jacobi-sdag              no test target   used for fttest
    ping                     crash            Yanhua
    startuptest              compile fails    Eric
    hello-crosscorruption    hangs

Note: for a formatted version, please see the attached file.

xiangni avatar Apr 24 '19 22:04 xiangni

Original date: 2014-11-04 05:31:31


Running all the tests takes only 80s. Here is the time breakdown for each directory, in seconds: AMPI 8, Charm++ 37, Converse 34, FEM 0.2, Util 0.02.

However, there are tests in the tests directory that are never run by the autobuild. Some of them even crash. Here is the list, along with the name of the person who either wrote each test or made the last significant modifications to it.

AMPI

    Test                     Status           Author / note
    chkpt                    fails            Gengbin
    fallreduce               compile fails    Gengbin
    jacobi3d                 no test target   Esteban
    mpich-test               no test target
    speed                    works!           Gengbin

Charm++

    Test                     Status           Author / note
    kNeighbor                no test target   Chao
    array4D                  crash            Abhinav
    pmetest                  compile fails    Sameer
    arrayPerf                works!           Sameer
    python                   compile fails    Filippo
    topology                 works!           Abhinav
    broadcast                works!           Phil
    io                       no test target   Phil
    reductiontesting         no test target   Rahul
    jacobi3d                 no test target   used for fttest
    commSpeed                works!           Terry
    jacobi3d-gauss           no test target   Yanhua
    penciltest               compile fails    Sameer
    commtest                 crash            Abhishek
    jacobi-sdag              no test target   used for fttest
    ping                     crash            Yanhua
    startuptest              compile fails    Eric
    hello-crosscorruption    hangs

xiangni avatar Apr 24 '19 22:04 xiangni

Original date: 2014-11-04 20:21:16


Within tests/charm++ and tests/converse, there seem to be just a few things that make up the bulk of the 30 seconds each, and that add little in the way of correctness testing. The most noticeable is tests/charm++/queue/msgtest, with pgm in the same directory also being long-ish. tests/charm++/communication_overhead is similar. It's worth testing that the runtime doesn't fall over with long queues or large messages, but we don't need to hammer on them for nearly so long. charm++/taskSpawn{,Recursive} take 8 and 4 seconds, respectively, although their parameters could be reduced to make them run faster.

PhilMiller avatar Apr 24 '19 22:04 PhilMiller

Original date: 2014-11-04 21:49:25


Total time to run examples is 46s: charm++ 13s, converse 0.317s, ampi 32s, armci 1.1s.

xiangni avatar Apr 24 '19 22:04 xiangni

Original date: 2016-03-08 22:13:41


On my lab machine with a 'netlrts-linux-x86_64 --with-production' build, the total time to run 'tests' is 71s: charm++ 39s, converse 22s, ampi 9s, fem 0.2s, util 0.01s.

The total time to run 'examples' is 31s: charm++ 28s, converse 0.1s, ampi 2s, armci 0.7s.

Are these runtimes acceptable or not? Also, the perftest/ directory doesn't look like it does anything at all ... should we remove it?

stwhite91 avatar Apr 24 '19 22:04 stwhite91

Original date: 2017-04-13 21:04:11


Here is an individual test time breakdown for 'make test' with a 'netlrts-linux-x86_64 smp --with-production' build on intellect: https://docs.google.com/a/illinois.edu/spreadsheets/d/1xpCPCNWrrEIZmHGtAC_K1bkor2SOIDuxhp7kkbZq0x8/edit?usp=sharing

359s total time. However, the wall-clock time is somewhat larger, as only the execution time of each testrun is recorded.

Only 5 tests require more than 10s.

    Test                     Time (s)
    commbench/pgm            86.007
    machinetest/multiping    19.88
    megampi/pgm              16.885
    Cjacobi3D/jacobi         14.72
    kNeighbor/kNeighbor      10.774

So we have one major issue, in that commbench can take a long time. Then we have a minor issue, in that each test takes over a second, so having over ninety of them means at least another minute and a half.
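
To put these numbers in proportion, a minimal Python sketch follows. The times and the 359s total are the ones reported in this comment. The function names are illustrative and not part of the Charm++ tree.

```python
# Per-test times in seconds, copied from the breakdown in this comment.
slow_tests = {
    "commbench/pgm": 86.007,
    "machinetest/multiping": 19.88,
    "megampi/pgm": 16.885,
    "Cjacobi3D/jacobi": 14.72,
    "kNeighbor/kNeighbor": 10.774,
}
TOTAL_SECONDS = 359.0  # total of recorded testrun times reported above


def over_threshold(times, threshold):
    """Return (name, seconds) pairs above threshold, slowest first."""
    return sorted(
        ((name, secs) for name, secs in times.items() if secs > threshold),
        key=lambda item: item[1],
        reverse=True,
    )


def share_of_total(seconds, total):
    """Return seconds as a percentage of total."""
    return 100.0 * seconds / total


if __name__ == "__main__":
    for name, secs in over_threshold(slow_tests, 10.0):
        pct = share_of_total(secs, TOTAL_SECONDS)
        print(f"{name:24s} {secs:8.3f}s  {pct:5.1f}%")
    remainder = TOTAL_SECONDS - sum(slow_tests.values())
    print(f"all other tests          {remainder:8.3f}s")
```

The remainder line shows how much of the total is spread across the many short tests rather than concentrated in the five slow ones.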

ericjbohm avatar Apr 24 '19 22:04 ericjbohm

Original date: 2017-04-13 21:15:01


We want megampi/pgm and Cjacobi3D/jacobi to run for more than a few seconds to test AMPI messaging/collectives/migration all in one go.

I think commbench and multiping can be cut down in runtime a bit...

stwhite91 avatar Apr 24 '19 22:04 stwhite91

Original date: 2017-04-13 22:08:46


The two commbench tests which take the longest are pingpong and flood. Each of these has entirely hard-coded parameters. The iteration counts hard-coded into these tests appear to massively exceed the number required for reasonable accuracy.

Reducing them by an order of magnitude (or by a factor of 2 for relatively small counts, <1e2) cuts runtime by a factor of 3.
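
The reduction rule described here can be sketched in Python as follows. This is only an illustration of the rule, not the actual commbench change: the function name, the cutoff parameter, and the floor of one iteration are assumptions.

```python
def reduced_iterations(count, small_cutoff=100):
    """Scale down a hard-coded iteration count.

    Counts below small_cutoff (1e2 in the comment above) are halved;
    larger counts are cut by an order of magnitude. The floor of 1 is
    an added assumption so a test never ends up with zero iterations.
    """
    divisor = 2 if count < small_cutoff else 10
    return max(1, count // divisor)


if __name__ == "__main__":
    # Hypothetical original counts, chosen only to exercise both branches.
    for original in (50, 100, 1000, 100000):
        print(original, "->", reduced_iterations(original))
```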

ericjbohm avatar Apr 24 '19 22:04 ericjbohm

megacon, Converse pingpong and pingpong_multipairs, examples/charm++/load_balancing/kNeighbor, and benchmarks/charm++/kNeighbor are tests that take more than a minute on our MPI Linux SMP CI.

evan-charmworks avatar Aug 13 '20 19:08 evan-charmworks