sst-core
sst-core copied to clipboard
Run Time Measurements Not Synchronized
When using print-timing-info
, the "Run Loop Time" and "Build Time" can add up to a total time larger than the wall clock time.
This is because in start_simulation , start_build
is set before the first barrier is called. This creates a situation where the first task is considered to be "simulating" while all other tasks are still building.
Update: our team hasn't run tests to ensure this assessment is correct, so the following may be more precise:
This ~is~ could be because in start_simulation
...
In start_simulation ,
start_build
is set before the first barrier is called. This creates a situation where the first task is considered to be "simulating" while all other tasks are still building.
Is that understanding of how things run correct? The assumption is that you can subtract the build time from the total time to get the run time. If there is some part of sst-core metrics that I don't understand then I could be wrong and this isn't a bug.
The runtime timings in the core were put in to get an approximate idea of how the various phases of the runtime were split up. There is no guarantee that the numbers will add up exactly to the runtime. The new profiling points are there to provide more accurate profiling information. That being said, things could probably be improved with the timings provided with -v and --print-timing-info.
@researcherben Is this still an issue?
@jwilso I'll check and get back you next week.
The breakdowns for timings in the code are as follows:
double start = sst_get_cpu_time();
...
double start_build = sst_get_cpu_time();
...
double start_run = sst_get_cpu_time();
info.build_time = start_run - start_build;
...
double end_time = sst_get_cpu_time();
info.run_time = end_time - start_run;
...
double total_end_time = sst_get_cpu_time();
double total_time = total_end_time - start;
Given that you're never going to get total_time = build_time + run_time even with 1 thread on 1 rank. When you throw in multiple ranks, threads it just grabs the max of each of those. max(run_time), max(total_time), max(build_time).
It may be a good idea to modify the output text somewhat to clear up any confusion.