
Use double instead of float to reduce accumulation of errors

Open sharwell opened this issue 7 years ago • 7 comments

After observing wide deviations between the displayed call stack metrics and reality (cases with over 300% error were observed), I finally traced the problem back to the use of float instead of double.

:memo: Care was taken to avoid changing serialization formats, but some public signatures were updated. I do not anticipate other semantic changes as a result. While this change increases the memory overhead of reviewing large traces, a solution already exists for users who need additional memory (PerfView64).

:memo: Please make sure not to rebase or squash this change during the merge process.

sharwell avatar Jun 14 '17 15:06 sharwell

Codecov Report

Merging #290 into master will increase coverage by 0.12%. The diff coverage is 28.48%.

@@            Coverage Diff             @@
##           master     #290      +/-   ##
==========================================
+ Coverage   17.44%   17.56%   +0.12%     
==========================================
  Files         213      213              
  Lines      123790   123790              
  Branches    11971    11971              
==========================================
+ Hits        21590    21745     +155     
+ Misses     101189   101186       -3     
+ Partials     1011      859     -152
| Flag     | Coverage Δ                          |
|----------|-------------------------------------|
| #2017    | 17.56% <28.48%> (+0.12%) :arrow_up: |
| #Debug   | 17.56% <28.48%> (+0.12%) :arrow_up: |
| #Release | ?                                   |
| Impacted Files                                | Coverage Δ            |
|-----------------------------------------------|-----------------------|
| src/TraceEvent/TraceEventSession.cs           | 2.45% <ø> (ø) :arrow_up: |
| src/PerfView/EventViewer/EventWindow.xaml.cs  | 0% <0%> (ø) :arrow_up: |
| src/HeapDump/GCHeapDump.cs                    | 0% <0%> (ø) :arrow_up: |
| src/TraceEvent/Computers/ThreadTimeComputer.cs | 0% <0%> (ø) :arrow_up: |
| src/PerfView/Utilities/CacheFiles.cs          | 0% <0%> (ø) :arrow_up: |
| src/PerfView/Dialogs/RunCommandDialog.xaml.cs | 0% <0%> (ø) :arrow_up: |
| src/MemoryGraph/graph.cs                      | 0% <0%> (ø) :arrow_up: |
| src/TraceEvent/TraceEventStacks.cs            | 0% <0%> (ø) :arrow_up: |
| src/PerfView/memory/ClrProfilerSources.cs     | 0% <0%> (ø) :arrow_up: |
| src/PerfView/memory/ClrProfilerParser.cs      | 0% <0%> (ø) :arrow_up: |
| ... and 30 more                               |                       |

Continue to review full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Powered by Codecov. Last update ed9cc8d...91efe97.

codecov-io avatar Jun 14 '17 17:06 codecov-io

While I am OK with using doubles where the size of the number is not a concern, I am concerned that this seems to be a fairly blanket change from float to double. In particular, CallTreeNodeBase has gotten significantly larger, and there are a LOT of these when creating a stack view.

There is a tradeoff here and I don't want to make it lightly. We should document the kinds of issues you are seeing, and then survey the spectrum of options we have and their relative costs.

vancem avatar Jun 15 '17 14:06 vancem

@vancem After reviewing the change, the sweeping approach doesn't appear problematic except for the question of the 6 fields in CallTreeNodeBase. However, the specific problem I encountered was cases where the inclusive and exclusive counts and metrics failed to show the correct values (with very high error between the actual numbers and the displayed numbers). The folded metric is of less importance to me since I set the fold to 0, but the savings there seem somewhat marginal.

sharwell avatar Jun 18 '17 03:06 sharwell

@vancem This problem is most easily observed in the following case:

  1. Open a large heap dump
  2. Extract the information without sampling
  3. Open the Heap Stacks view
  4. Set the Fold% to 0
  5. Set the GroupPats to empty

In other words, this is a "show all objects just as they are" view. In a large heap, all the metrics will cap out at a number much smaller than the true value.

Perhaps even more concerning to me are the cases where the numbers are not so obviously incorrect. These numbers are used to drive business decisions, and considering the minimal impact this change has on observed performance, I believe it's reasonable to put accuracy first and then find other ways to make up the difference.

sharwell avatar Nov 03 '17 16:11 sharwell

:memo: I have used this change with traces where the ETLX file is at or very near the 4 GiB boundary imposed by the file format, and have not observed cases where memory consumption was unreasonable or caused a failure. I have likewise used it on heap dumps of 65 million objects without sampling, grouping, or folding (unfiltered, unpruned stacks).

sharwell avatar Nov 13 '17 19:11 sharwell

Updated after repository rewrite

sharwell avatar Feb 02 '18 12:02 sharwell

Updated to address merge conflicts

sharwell avatar Nov 22 '19 22:11 sharwell