sst-core icon indicating copy to clipboard operation
sst-core copied to clipboard

Run Time Measurements Not Synchronized

Open bhpayne opened this issue 1 year ago • 7 comments

When using print-timing-info, the "Run Loop Time" and "Build Time" can add up to a total time larger than the wall clock time.

This is because in start_simulation , start_build is set before the first barrier is called. This creates a situation where the first task is considered to be "simulating" while all other tasks are still building.

bhpayne avatar Oct 28 '23 23:10 bhpayne

Update: our team hasn't run tests to ensure this assessment is correct, so the following may be more precise:

This ~is~ could be because in start_simulation...

researcherben avatar Oct 31 '23 00:10 researcherben

In start_simulation , start_build is set before the first barrier is called. This creates a situation where the first task is considered to be "simulating" while all other tasks are still building.

Is that understanding of how things run correct? The assumption is that you can subtract the build time from the total time to get the run time. If there is some part of sst-core metrics that I don't understand then I could be wrong and this isn't a bug.

researcherben avatar Nov 03 '23 23:11 researcherben

The runtime timings in the core were put in to get an approximate idea of how the various phases of the runtime were split up. There is no guarantee that the numbers will add up exactly to the runtime. The new profiling points are there to provide more accurate profiling information. That being said, things could probably be improved with the timings provided with -v and --print-timing-info.

feldergast avatar Nov 06 '23 16:11 feldergast

@researcherben Is this still an issue?

jwilso avatar Mar 28 '24 16:03 jwilso

@jwilso I'll check and get back you next week.

researcherben avatar Mar 29 '24 10:03 researcherben

The breakdowns for timings in the code are as follows:

double start = sst_get_cpu_time();
...
double start_build = sst_get_cpu_time();
...
double start_run = sst_get_cpu_time();
info.build_time = start_run - start_build;
...
double end_time = sst_get_cpu_time();
info.run_time = end_time - start_run;
...
double total_end_time = sst_get_cpu_time();
double total_time = total_end_time - start;

Given that you're never going to get total_time = build_time + run_time even with 1 thread on 1 rank. When you throw in multiple ranks, threads it just grabs the max of each of those. max(run_time), max(total_time), max(build_time).

It may be a good idea to modify the output text somewhat to clear up any confusion.

bpswenson avatar Apr 02 '24 13:04 bpswenson