Use double instead of float to reduce accumulation of errors
After observing wide deviations between the displayed call stack metrics and reality (cases with over 300% error were observed), I finally traced the source of the problem back to the use of `float` instead of `double`.
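As a minimal, self-contained illustration of the failure mode (hypothetical values, not PerfView code): accumulating many small samples in a `float` stops making progress once the running total reaches roughly 2^24 times the sample size, while a `double` keeps the same sum exact.

```csharp
using System;

class FloatAccumulationDemo
{
    static void Main()
    {
        const int samples = 100_000_000;

        float singleSum = 0f;
        double doubleSum = 0.0;
        for (int i = 0; i < samples; i++)
        {
            singleSum += 1.0f;   // stalls at 16,777,216 (2^24): adding 1 no longer changes the sum
            doubleSum += 1.0;
        }

        Console.WriteLine($"float : {singleSum:N0}");   // 16,777,216
        Console.WriteLine($"double: {doubleSum:N0}");   // 100,000,000
    }
}
```

Here the true total is roughly 6x what the float accumulator reports, which is the same kind of error described above.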
:memo: Care was taken to avoid changing serialization formats, but some public signatures were updated. I do not anticipate any other semantic changes from this. While the change increases the memory overhead of reviewing large traces, we already have a solution for users who need additional memory (PerfView64).
:memo: Please make sure not to rebase or squash this change during the merge process.
Codecov Report
Merging #290 into master will increase coverage by 0.12%. The diff coverage is 28.48%.
| | master | #290 | +/- |
|---|---|---|---|
| Coverage | 17.44% | 17.56% | +0.12% |
| Files | 213 | 213 | |
| Lines | 123790 | 123790 | |
| Branches | 11971 | 11971 | |
| Hits | 21590 | 21745 | +155 |
| Misses | 101189 | 101186 | -3 |
| Partials | 1011 | 859 | -152 |
| Flag | Coverage Δ | |
|---|---|---|
| #2017 | 17.56% <28.48%> (+0.12%) | :arrow_up: |
| #Debug | 17.56% <28.48%> (+0.12%) | :arrow_up: |
| #Release | ? | |
| Impacted Files | Coverage Δ | |
|---|---|---|
| src/TraceEvent/TraceEventSession.cs | 2.45% <ø> (ø) | :arrow_up: |
| src/PerfView/EventViewer/EventWindow.xaml.cs | 0% <0%> (ø) | :arrow_up: |
| src/HeapDump/GCHeapDump.cs | 0% <0%> (ø) | :arrow_up: |
| src/TraceEvent/Computers/ThreadTimeComputer.cs | 0% <0%> (ø) | :arrow_up: |
| src/PerfView/Utilities/CacheFiles.cs | 0% <0%> (ø) | :arrow_up: |
| src/PerfView/Dialogs/RunCommandDialog.xaml.cs | 0% <0%> (ø) | :arrow_up: |
| src/MemoryGraph/graph.cs | 0% <0%> (ø) | :arrow_up: |
| src/TraceEvent/TraceEventStacks.cs | 0% <0%> (ø) | :arrow_up: |
| src/PerfView/memory/ClrProfilerSources.cs | 0% <0%> (ø) | :arrow_up: |
| src/PerfView/memory/ClrProfilerParser.cs | 0% <0%> (ø) | :arrow_up: |
| ... and 30 more | | |
Continue to review full report at Codecov.
Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ed9cc8d...91efe97. Read the comment docs.
While I am OK with using doubles where the size of the number is not a concern, I am concerned that this seems to be a fairly blanket change from float to double. In particular, `CallTreeNodeBase` has gotten significantly larger, and there are a LOT of these when creating a stack view.
There is a tradeoff here, and I don't want to make it lightly. We should document the kinds of issues you are seeing, and we will need to see what spectrum of options we have and their relative costs.
@vancem After reviewing the change, the sweeping approach doesn't appear problematic except for the question of the 6 fields in `CallTreeNodeBase`. However, the specific problem I encountered was that the inclusive and exclusive counts and metrics were failing to show the correct values (with very high error rates between the actual numbers and the displayed numbers). The folded metric is of less importance to me since I set the fold to 0, and the savings there seem somewhat marginal.
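To put rough numbers on the size concern, here is a back-of-the-envelope sketch; the node count below is hypothetical, and only the six-field count comes from this discussion.

```csharp
using System;

class NodeSizeEstimate
{
    static void Main()
    {
        // Six metric fields widened from float (4 bytes) to double (8 bytes).
        const int metricFields = 6;
        // Hypothetical number of CallTreeNodeBase instances in a large stack view.
        const long hypotheticalNodeCount = 10_000_000;

        long extraBytesPerNode = metricFields * (sizeof(double) - sizeof(float)); // 24 bytes
        long extraTotalBytes   = extraBytesPerNode * hypotheticalNodeCount;       // ~240 MB

        Console.WriteLine($"Extra per node : {extraBytesPerNode} bytes");
        Console.WriteLine($"Extra for view : {extraTotalBytes / 1_000_000.0:N0} MB");
    }
}
```

Whether that overhead matters in practice presumably depends on how many nodes a given view actually materializes, which is part of why PerfView64 is called out above as the escape hatch.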
@vancem This problem is most easily observed in the following case:
- Open a large heap dump
- Extract the information without sampling
- Open the Heap Stacks view
- Set the Fold% to 0
- Set the GroupPats to empty
In other words, this is a "show all objects just as they are" view. In a large heap, all of the metrics cap out at a number much smaller than the true value, because the float running totals stop changing once they grow large enough.
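To make the cap-out concrete with purely illustrative numbers (the object count and average size below are hypothetical, not measured from a real dump), this sketch computes where a float byte total stops advancing versus the true total.

```csharp
using System;

class CapOutEstimate
{
    static void Main()
    {
        const long objectCount = 65_000_000;   // hypothetical unsampled heap
        const double avgSizeBytes = 56;        // hypothetical average object size

        double trueTotal = objectCount * avgSizeBytes;   // 3,640,000,000 bytes

        // A float addition is lost once the addend is no more than half a ULP of the
        // running total; with a 24-bit significand that happens once the total reaches
        // roughly 2^(ceil(log2(addend)) + 24).
        double floatStallPoint = Math.Pow(2, Math.Ceiling(Math.Log2(avgSizeBytes)) + 24);

        Console.WriteLine($"true total        : {trueTotal:N0} bytes");        // 3,640,000,000
        Console.WriteLine($"float total stalls: {floatStallPoint:N0} bytes");  // 1,073,741,824
    }
}
```

Under these assumed numbers the displayed total would be several times smaller than reality, which is the shortfall described above.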
Perhaps even more concerning to me are the cases where the numbers are not so obviously incorrect. These numbers are used to drive business decisions, and considering the minimal impact this change has on observed performance, I believe it's reasonable to put accuracy first and then find other ways to make up the difference.
:memo: I have used this change with traces where the ETLX file is at or very near the 4 GiB boundary imposed by the file format, and have not observed cases where memory consumption was unreasonable or caused a failure. I have likewise used it on heap dumps of 65 million objects without sampling, grouping, or folding (unfiltered, unpruned stacks).
Updated after repository rewrite
Updated to address merge conflicts