Decoder for Time Information not found
Pilgrim uses several methods to compress time information (intervals and durations), but there seems to be no decoder for it in pilgrim_app_generator.c. I am confused by how pilgrim_logger.c writes the time information to intervals.dat and durations.dat, and I have no idea how to decode it. Could you please give some examples of decoding these files? Many thanks.
The Pilgrim app generator (pilgrim_app_generator.c) is a code generator. Given Pilgrim traces, it tries to generate a C program that reproduces the communication pattern. It relies on the order of the captured calls, not on the detailed time information.
I just added an example of decoding timestamps to pilgrim2text.c: https://github.com/pmodels/pilgrim/pull/29 (you need to rebuild Pilgrim and regenerate the traces).
So you may want to start from pilgrim2text.c. You can also try running pilgrim2text /path/to/your/trace-dir to see the outputs. The command will generate a _text directory under your traces directory.
By the way, the two papers below have all the details about Pilgrim: "Near-Lossless MPI Tracing and Proxy Application Autogeneration" and "Pilgrim: Scalable and (near) Lossless MPI Tracing".
I have read your papers and I am interested in Pilgrim. Your new PR helped me a lot, many thanks. However, I sometimes get a segmentation fault when generating a proxy with Pilgrim. The test programs are FLASH problems such as sedov-3d and stirturb. It seems that sym->val can be < 0 in some cases here:
void handle_one_symbol_pre(FILE* f, Symbol *sym, CallSignature *cst) {
    if(wt_loop) {
        if(sym->val >= 0)
            wt_loop_count -= get_wt_completed_reqs(&cst[sym->val]);
        /* sym->val may be < 0 here, in which case cst[sym->val] is an
           out-of-bounds read and the segmentation fault happens */
        if(cst[sym->val].func_id != wt_loop_call_id) {
            // .....
        }
    }
}
I tried returning immediately when sym->val < 0, but then the generated proxy is messed up. Do you know how to fix this? Many thanks in advance. Also, I found that the generated proxy uses the same buf in MPI_Reduce for both send and receive, which results in a runtime error reported by Open MPI. I worked around this with a small modification that uses a separate buf_recv.
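On the sym->val < 0 crash: below is a minimal sketch of a bounds-checked lookup, not a fix taken from Pilgrim itself. It assumes that a negative sym->val marks a symbol that is not a terminal and so cannot index cst; that assumption should be checked against the grammar code. The struct definitions are simplified stand-ins, and lookup_signature, differs_from_wait_call and cst_len are hypothetical names. The idea is to report "not a terminal" to the caller explicitly rather than silently returning, so the caller can still emit whatever the symbol requires.

```c
#include <stddef.h>

/* Simplified stand-ins for Pilgrim's types (assumption: the real structs
   have more fields; only the ones used in this thread are modelled). */
typedef struct { int func_id; } CallSignature;
typedef struct { int val; } Symbol;

/* Hypothetical helper: return the call signature for sym, or NULL when
   sym->val cannot be used as an index into cst (negative or too large). */
const CallSignature *lookup_signature(const Symbol *sym,
                                      const CallSignature *cst,
                                      int cst_len) {
    if (sym == NULL || cst == NULL)
        return NULL;
    if (sym->val < 0 || sym->val >= cst_len)
        return NULL;
    return &cst[sym->val];
}

/* Hypothetical helper mirroring the comparison in the snippet above.
   Returns  1 if sym's func_id differs from wt_loop_call_id,
            0 if it matches,
           -1 if sym has no call signature, so the caller must handle
              that case explicitly instead of indexing cst. */
int differs_from_wait_call(const Symbol *sym, const CallSignature *cst,
                           int cst_len, int wt_loop_call_id) {
    const CallSignature *sig = lookup_signature(sym, cst, cst_len);
    if (sig == NULL)
        return -1;
    return sig->func_id != wt_loop_call_id;
}
```

With this shape, handle_one_symbol_pre would branch on the -1 result instead of returning early. What should be generated in that branch depends on how the generator treats such symbols, which this sketch does not attempt to answer.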
Here is a case where I return immediately in handle_one_symbol_pre. The function generated for the nonterm is empty, which is abnormal.
I set buf_0 = malloc(10000000); myself, because buf_0 = malloc(0); sometimes appears in the generated code.
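As an alternative to hard-coding a large size, the zero-size case could be guarded in one place. In C, malloc(0) may return either NULL or a pointer that must not be dereferenced, so any access through such a buffer is unsafe. The helper below is only a sketch: alloc_proxy_buf is a hypothetical name, not something Pilgrim generates, and rounding a zero request up to one byte only makes the allocation valid. It does not recover the size the traced call actually needed.

```c
#include <stdlib.h>

/* Hypothetical helper (not part of Pilgrim): allocate a zero-initialised
   buffer, treating a request for 0 bytes as a request for 1 byte so the
   result is a valid, freeable pointer unless the allocation fails. */
void *alloc_proxy_buf(size_t nbytes) {
    if (nbytes == 0)
        nbytes = 1;
    return calloc(1, nbytes);
}
```

In the generated proxy this would replace a line such as buf_0 = malloc(0); with buf_0 = alloc_proxy_buf(0);, and the buffer is still released with free.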

Sorry for the late response; I was totally occupied by other projects. I will take a look at this, but it might take me a while.