tsung
tsung copied to clipboard
tsung.log contains duplicate time stamps
Disclaimer: I'm running a slightly patched version of https://github.com/processone/tsung/commit/d98c21b7bddd09de399034dce9ccf5f71d881922. I cannot yet confirm that the problem still exists. But as far as I can tell, nothing was changed that could have fixed that.
For some time now I observe a duplicate timestamps in tsung.log. This happens in about 1% of tsung runs as far as I can tell (with >10k runs I analysed).
...
# stats: dump at 1426014899
...
# stats: dump at 1426014899
EndMonitor:{1426,14899,543959}
I guess this happens as data get's flushed once more before the test closes. The timestap in the tsung.log marker has only a second resolution which could explain this.
In many cases the last snapshot does not contain new monitoring data, but too often the last two snapshots contain indeed different data, which is kind of bad when analysing the log file. Take the example at the end and look at request.
I have no additional logging (log level warn). The only thing I can tell is, that I abort tsung (by limiting the arrival phase duration):
=INFO REPORT==== 10-Mar-2015::19:14:59 ===
ts_config_server:(0:<0.74.0>) Tsung test max duration reached, exits !
=INFO REPORT==== 10-Mar-2015::19:14:59 ===
ts_mon:(0:<0.75.0>) Aborting by request !
I have no means to reliably reproduce this and I'm not quite sure how to start debugging this. I'm grateful for any hints leading in the right direction.
# stats: dump at 1426014899
stats: users 500 500
stats: {recvpackets,"tsung_controller@HOST"} 1 2430.0 0.0 4331 2035 2991.8487394957983 119
stats: {sentpackets,"tsung_controller@HOST"} 1 2139.0 0.0 3900 1867 2541.5294117647068 119
stats: {cpu,"tsung_controller@HOST"} 1 0.3010536879076769 0.0 20.07483466759485 0.1006036217303823 1.150046325215228 120
stats: {load,"tsung_controller@HOST"} 1 0.171875 0.0 1.44140625 0.0 0.11259765624999998 120
stats: {freemem,"tsung_controller@HOST"} 1 3408.19921875 0.0 3439.26171875 3406.05078125 3416.849641927083 120
stats: session 0 0 0 1800.406982421875 1800.406982421875 1800.406982421875 1
stats: users_count 0 501
stats: finish_users_count 0 1
stats: connect 36 140.3225843641493 493.2737116389726 63295.9208984375 0.44921875 825.8163468022112 3499
stats: request 1648 1650.6232963487946 3116.933350529512 64357.56591796875 28.64892578125 1593.2868009954059 163551
stats: 504 53 4594
stats: page 981 2876.637955380748 4320.156413450077 78185.60107421875 29.821044921875 2660.723369609111 97847
stats: 200 1503 152310
stats: size_rcv 6400772 631622182
stats: 422 0 47
stats: size_sent 646872 67662066
stats: connected -6 490
stats: 301 0 37
stats: 404 63 4663
stats: 500 29 3548
# stats: dump at 1426014899
stats: users 500 500
stats: {recvpackets,"tsung_controller@HOST"} 0 0 0 4331 2035 2987.1666666666665 120
stats: {sentpackets,"tsung_controller@HOST"} 0 0 0 3900 1867 2538.175000000001 120
stats: {cpu,"tsung_controller@HOST"} 0 0 0 20.07483466759485 0.1006036217303823 1.1430298571383062 121
stats: {load,"tsung_controller@HOST"} 0 0 0 1.44140625 0.0 0.11308755165289255 121
stats: {freemem,"tsung_controller@HOST"} 0 0 0 3439.26171875 3406.05078125 3416.778150826446 121
stats: session 0 0 0 1800.406982421875 1800.406982421875 1800.406982421875 1
stats: users_count 0 501
stats: finish_users_count 0 1
stats: connect 1 55.80908203125 0.0 63295.9208984375 0.44921875 818.8353636486694 3535
stats: request 45 2340.7662109374996 3547.016206297924 64357.56591796875 28.64892578125 1593.8587811184236 165199
stats: page 24 4105.390706380209 4180.252407024528 78185.60107421875 29.821044921875 2662.86661047852 98828
stats: 504 6 4600
stats: 200 36 152346
stats: size_rcv 150525 631772707
stats: 422 0 47
stats: size_sent 13722 67675788
stats: connected 1 491
stats: 301 0 37
stats: error_abort 1 1
stats: 404 1 4664
stats: 500 2 3550
EndMonitor:{1426,14899,543959}
As I understood the situation now, I think that the last snapshot is just a "flush" that happens within the same second. To desitingish the snapshots, we could make the timestamp (# stats: dump at $timestamp) more precise to make it easier.
Beside possible hickups and confusion, I'm not sure if this is a bigger issue or not.
For me this is a bug. The reason is clear, but still this is quite confusing.
I'd suggest to make the timestamp more precise to at least be able to identify that this is in fact not the same. This should not really pose a problem for tsung.log parsers, but that should be a very easy adaptation.
What do you think, @nniclausse? If you agree, I'd prepare a PR.