devel-nytprof icon indicating copy to clipboard operation
devel-nytprof copied to clipboard

Please add histograms for timing data [rt.cpan.org #81096]

Open timbunce opened this issue 10 years ago • 1 comments
trafficstars

Migrated from rt.cpan.org#81096 (status was 'open')

Requestors:

From [email protected] on 2012-11-10 20:32:55:

NYTProf provides average time per call in its reports, like so:

# spent 31.1ms (2.60+28.5) within Eval::Closure::eval_closure which was called 149 times, avg 
209µs/call:
#126 times (2.31ms+25.9ms) by Class::MOP::Method::Generated::_compile_code at line 63 of 
Class/MOP/Method/Generated.pm, avg 224µs/call
#  23 times (292µs+2.64ms) by Moose::Meta::TypeConstraint::_actually_compile_type_constraint 
at line 320 of Moose/Meta/TypeConstraint.pm, avg 128µs/call

However, averages hide a lot of data. One of the lessons learned from DTrace, eg in 
http://dtrace.org/blogs/brendan/2011/12/18/visualizing-device-utilization/ , is that averages 
hide many types of interesting data: the rare spike, bimodal distributions, etc. At the Moving to 
Moose Hackathon, we saw a practical example of how NYTProf's averages were not telling the 
whole story. regcomp was taking a significant chunk of a program's time, which NYTProf of 
course pointed out like a champ. But we scratched our heads thinking it was a bug in Perl until 
we generated a histogram using DTrace, which showed that 3/4ths of the regcomps were super 
fast, and 1/4th of them were many orders of magnitude slower. Once we attained that bit of 
information, we instantly saw that the problem was a particular regular expression, and tweaked 
it to not expose the problem. I'm sure there are many other examples of where a histogram 
would highlight performance bottlenecks.

So it would be fantastic if NYTProf's reports included a histogram of how long each call took. 
DTrace again is a good model for how to render this kind of data: list each power of 2 in the first 
column, representing milliseconds. Then, in the next column, list how many times the number of 
calls that took that chunk of time (so for example, there might be 500 calls between 16-31ms, 
and 4 calls between 1024 and 2048ms). Then you can also render that information visually. 
Here's an example of DTrace's output from quantize (note the first and last values will generally 
have no hits, they're just there to bookend the data).

           value  ------------- Distribution ------------- count
           32768 |                                         0
           65536 |@@                                       1
          131072 |@@@@@@@@@@@@@@@@@@@@                     9
          262144 |@@@@@@@@@@@@@@@@                         7
          524288 |                                         0
         1048576 |@@                                       1
         2097152 |                                         0

I'd be very happy to implement this if you're interested in the idea and you could guide me!

From [email protected] on 2012-11-13 22:51:57:

I agree that this is a good idea.

Currently the subroutine profiler stores its data in an array held in a hash that's keyed by 
caller_subname[fid:line].

See incr_sub_inclusive_time(). Especially subr_call_av and the NYTP_SCi_* macros.

timbunce avatar Feb 22 '15 13:02 timbunce

@nrr is interested in this, as am i. Just poking this for papertrail and possible neartime investigation.

wchristian avatar Apr 04 '15 11:04 wchristian