drmemtrace timestamp and cpu_id are documented as before but are really after their buffer contents
The drcachesim trace buffer unit header contains a timestamp and a cpu_id. These are collected when we're flushing the buffer (https://github.com/DynamoRIO/dynamorio/blob/3f1d9f585d6b32d63f349345b968fd65312562b9/clients/drcachesim/tracer/tracer.cpp#L1135). So, even though they appear in the header, these fields actually reflect the time of the last instruction in the buffer, not the first.
We should either update the documentation to be more accurate (like https://github.com/DynamoRIO/dynamorio/blob/3f1d9f585d6b32d63f349345b968fd65312562b9/clients/drcachesim/common/trace_entry.h#L242), or modify the code to collect these values (timestamp, cpu_id, etc.) when we start a new buffer, save them in thread-local variables, and use them to create the header when we're writing the buffer to storage.
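The second option could be sketched roughly as follows. This is a minimal standalone illustration, not the tracer's actual code: names like `buf_start_timestamp`, `start_new_buffer`, and `flush_buffer` are hypothetical, and `std::chrono` stands in for whatever clock the tracer uses.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical sketch: capture the timestamp when a buffer is (re)started,
// rather than when it is flushed, so the unit header reflects the first
// entry in the buffer instead of the last.

static uint64_t
now_micros()
{
    using namespace std::chrono;
    return duration_cast<microseconds>(steady_clock::now().time_since_epoch())
        .count();
}

// One slot per thread: the timestamp sampled when the buffer was started.
static thread_local uint64_t buf_start_timestamp = 0;

static void
start_new_buffer()
{
    // Collect the header value up front and stash it thread-locally
    // (cpu_id would be handled the same way).
    buf_start_timestamp = now_micros();
}

static uint64_t
flush_buffer()
{
    // Use the saved start-of-buffer value for the header, instead of
    // sampling the clock here at flush time.
    return buf_start_timestamp;
}
```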
dynamorio-users discussion: https://groups.google.com/g/dynamorio-users/c/AxOo5PMuEQU
I would not expect this to matter much: buffers are output prior to syscalls or signals, so at most this is a couple thousand instructions. Given the overall accuracy granularity and how analyzers/simulators use the timestamps and cpu_id, I don't see it having much effect one way or the other.
Could thread preemption cause large delays? Or would that be rare since we are going into the kernel for IO frequently?
For -L0_filter traces we could have larger delays since trace entries are written less frequently.
I was also trying to use the timestamps to "visualize" the traces, but we no longer have the information on gaps.
Suppose a thread tid0 was running on CPUx, migrated to CPUy, and then moved back to CPUx in the following manner, where * indicates that tid0 was running and . indicates that something else was running or the CPU was idle:
```
CPUx *****............................*******..
CPUy ........*****************.................
```
We can only get this information from the trace markers. Also, since the timestamps are in microseconds, many consecutive markers occur at the "same time":
```
CPUx ....*..................................*..
CPUy ........................*.................
```
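The microsecond granularity is easy to demonstrate with a standalone sketch (not DynamoRIO code): reading a microsecond-resolution clock back to back typically yields long runs of identical values, since a single clock read takes only tens of nanoseconds on a modern machine.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Count how often two back-to-back microsecond clock reads are identical.
// Most consecutive reads land in the same microsecond "tick", which is why
// many consecutive trace markers appear to occur at the same time.
static int
count_duplicate_micros(int reads)
{
    using namespace std::chrono;
    int duplicates = 0;
    uint64_t prev = 0;
    for (int i = 0; i < reads; ++i) {
        uint64_t now = duration_cast<microseconds>(
                           steady_clock::now().time_since_epoch())
                           .count();
        if (now == prev)
            ++duplicates;
        prev = now;
    }
    return duplicates;
}
```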
Would it make sense to record both start and stop timestamps, and to record finer-grained timestamps (e.g., using rdtsc)?
The recorded schedule, with the tracing overhead and coarse-grained snapshots, will always have accuracy limitations, but your suggested improvement of a stop timestamp is reasonable. If you wanted to implement that and submit a pull request, that sounds ok to me. You'd add the stop timestamp as another marker in the buffer header, I would assume.
For rdtsc, there may be concerns about TSC sync across multi-processor setups: perhaps those corner cases are less common these days?
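For reference, a raw TSC read is a single instruction on x86. This standalone sketch (not proposed tracer code) uses the GCC/Clang intrinsic; the caveat above applies, since raw TSC values are per-core and not guaranteed to be synchronized across sockets. `__rdtscp` additionally returns the core's IA32_TSC_AUX value, which on Linux commonly encodes the CPU number, so a reader can detect migration between two samples.

```cpp
#include <cassert>
#include <cstdint>
#include <x86intrin.h> // __rdtscp (GCC/Clang on x86)

// Read the time-stamp counter and the per-core signature in one shot.
// __rdtscp waits for prior instructions to complete, which makes paired
// readings slightly more ordered than plain __rdtsc.
static uint64_t
read_tsc(unsigned *cpu_signature)
{
    return __rdtscp(cpu_signature);
}
```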
We have not generally been using the recorded schedule and are using a synthetic schedule of the software threads in many cases, especially when simulating on a different number of cores.
This is being solved in #2039, where we move the timestamps to the buffer start.