Mobile Vitals Expansion for iOS
Problem Statement
Sentry currently presents four metrics to mobile developers. We want to expand that set with additional measurements that help developers track their app's performance.
Solution Brainstorm
We are looking to include:
- Memory Utilization
- CPU Utilization
- FPS
- Battery level
- Internal / available storage
@armcknight @indragiek The idea for solving this was to create a background worker that monitors these metrics. AFAIK this is what profiling already does. Can we hook into the profiler to capture these metrics? If so, how?
This is what we have discussed so far:
- CPU Utilization
- Goals:
- collect samples throughout the transaction and display the average / max / standard deviation(?) of usage over the transaction period
- Tradeoff(s): overhead; is calling the OS API expensive?
- Notes:
- std dev needs further research as to whether it’s possible
- Memory Utilization
- Goals:
- collect samples throughout the transaction and display the average / max / standard deviation(?) of usage over the transaction period
- Tradeoff(s): overhead; is calling the OS API expensive?
- Notes:
- std dev needs further research as to whether it’s possible
- Battery Level ~~and Temperature~~
- Goals:
- delta of battery percentage
- only when not charging
- track percentage at beginning + end of transaction
- Tradeoff(s):
- overhead, if calling the OS API is expensive
- Notes:
- percentage delta: not much benefit seen, since the value has no decimals and is measured over 1 transaction
- Temperature is an indication of power consumption
- it is something that is currently extracted for errors and could be carried over to transactions
- Temperature not being done across native apps because iOS only provides 4 nominal levels, no numeric value
- std dev needs further research as to whether it’s possible
- would be more perceivable at scale (larger apps, etc.)
- transactions that happen the most
- FPS
- Goals:
- display average / max / standard deviation of FPS over the transaction
- Tradeoff(s):
- don’t know if this is feasible yet
- Notes:
- less important because we have slow and frozen frames
- std dev needs further research as to whether it’s possible
- Internal Storage Size / Available Storage
- Goals:
- delta from beginning to end of transaction
- capture avg and graph over time?
- Tradeoff(s):
- Notes:
- can this be narrowed down to the single app, rather than the whole device (which includes the impact of other apps)?
There is currently very little information available in this ticket. What exactly do we want to track, in what format do we send that data, and can we hook into the profiling code that is already there?
For example let's talk about CPU usage. We want to send average / max / standard deviation of CPU usage during the lifetime of the transaction. So do we hook into the profiling code and keep an array of CPU usage, and then calculate those 3 values and send them along with the transaction? In what format, what does the payload look like?
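On the "keep an array, then calculate" question: a minimal sketch, assuming we only need the three summary values and not the raw samples. It uses Welford's online algorithm, which is my suggestion and not something the profiler is known to do, so the mean, max and standard deviation can be updated per sample without storing an array. `RunningStats` and its members are hypothetical names, not existing sentry-cocoa code.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <limits>

// Hypothetical helper (not part of sentry-cocoa): accumulates samples one at a
// time with Welford's online algorithm, so no per-transaction array is needed.
struct RunningStats {
    std::size_t count = 0;
    double mean = 0.0;
    double m2 = 0.0; // running sum of squared deviations from the mean
    double max = -std::numeric_limits<double>::infinity();

    // Feed one sample, e.g. a CPU usage percentage read on a timer.
    void add(double sample) {
        ++count;
        const double delta = sample - mean;
        mean += delta / static_cast<double>(count);
        m2 += delta * (sample - mean);
        max = std::max(max, sample);
    }

    // Population standard deviation; returns 0 when there are no samples.
    double stddev() const {
        return count > 0 ? std::sqrt(m2 / static_cast<double>(count)) : 0.0;
    }
};
```

Each sampling tick would call `add(...)`, and `mean`, `max` and `stddev()` would be read when the transaction finishes. The cost per sample is a handful of floating point operations, so the open overhead question is the OS call that produces the sample, not the statistics.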
Same goes for memory usage. Do we measure the app memory usage during the lifetime of the transaction (like every 100 ms or something like that)? Or do we only need the start and end value? Or do you want to know the peak? But if you measure every x ms, you might still miss the peak of course. And is an average really interesting? You probably are mostly interested in the start and end values, to see if it increased. But that might be unrelated to the work happening in this transaction of course.
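To make the start / end / peak options concrete, here is a rough sketch, assuming memory is sampled as a byte count at some interval. `MemorySummary` is a hypothetical name, and how the byte count is obtained from the OS is deliberately left out.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical summary of sampled memory footprints (in bytes) for one
// transaction. The peak is only the highest *sampled* value; a spike between
// two samples is still missed.
struct MemorySummary {
    bool hasSample = false;
    std::uint64_t start = 0;
    std::uint64_t end = 0;
    std::uint64_t peak = 0;

    void record(std::uint64_t bytes) {
        if (!hasSample) {
            start = bytes;
            hasSample = true;
        }
        end = bytes;
        peak = std::max(peak, bytes);
    }

    // Signed growth from the first to the last sample; negative if it shrank.
    std::int64_t delta() const {
        return static_cast<std::int64_t>(end) - static_cast<std::int64_t>(start);
    }
};
```

Recording only at transaction start and finish gives `start`, `end` and `delta()` with two OS calls; the sampled `peak` is the only part that needs the periodic timer.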
I think for CPU usage, we need to keep track of wall and CPU time. If you don't know the difference, please check https://en.wikipedia.org/wiki/Elapsed_real_time.
@armcknight, I'm pretty sure we already do that for profiling. Can you point us to the code, please?
Furthermore, @armcknight, does the profiling code also keep track of memory usage? If so, please show us the code.
I think this could answer the time questions https://kandelvijaya.com/2016/10/25/precisiontiminginios/. Also worth looking at this PR https://github.com/getsentry/sentry-cocoa/pull/2105.
The profiler currently collects backtraces per thread at a sampling interval: https://github.com/getsentry/sentry-cocoa/blob/e3d2bc4d5bf2c6d8d4f47ba4537936a8286fb05e/Sources/Sentry/SentryProfiler.mm#L151
and https://github.com/getsentry/sentry-cocoa/blob/e3d2bc4d5bf2c6d8d4f47ba4537936a8286fb05e/Sources/Sentry/SentrySamplingProfiler.cpp#L51
Each backtrace has a time appended to it, and we postprocess this in the backend to calculate frame durations. Here is where the backtrace time is recorded: https://github.com/getsentry/sentry-cocoa/blob/e3d2bc4d5bf2c6d8d4f47ba4537936a8286fb05e/Sources/Sentry/SentryBacktrace.cpp#L114
and the absoluteTime implementation: https://github.com/getsentry/sentry-cocoa/blob/e3d2bc4d5bf2c6d8d4f47ba4537936a8286fb05e/Sources/Sentry/SentryTime.mm#L11-L18
@kevinrenskers, could you please play around with the above code to find out how expensive it is to get the current CPU overhead, similar to what we do with slow and frozen frames?
Maybe we can use clock for the CPU time. Be aware that clock can report more elapsed time than getAbsoluteTime when the process runs threads on multiple cores at the same time, because CPU time is summed across threads.
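A small sketch of the wall versus CPU time comparison, assuming portable stand-ins: `std::clock` for process CPU time and `std::chrono::steady_clock` in place of getAbsoluteTime. `TimeSnapshot`, `takeSnapshot` and `cpuUtilization` are hypothetical names for illustration only.

```cpp
#include <chrono>
#include <ctime>

// Hypothetical pair of readings taken at the same moment.
struct TimeSnapshot {
    std::clock_t cpu;                           // process CPU time
    std::chrono::steady_clock::time_point wall; // monotonic wall-clock time
};

inline TimeSnapshot takeSnapshot() {
    return {std::clock(), std::chrono::steady_clock::now()};
}

// CPU seconds divided by wall seconds between two snapshots. On POSIX systems
// the result can exceed 1.0 when several threads are busy on different cores,
// because CPU time is summed over all threads of the process.
inline double cpuUtilization(const TimeSnapshot &begin, const TimeSnapshot &end) {
    const double cpuSeconds =
        static_cast<double>(end.cpu - begin.cpu) / static_cast<double>(CLOCKS_PER_SEC);
    const double wallSeconds =
        std::chrono::duration<double>(end.wall - begin.wall).count();
    return wallSeconds > 0.0 ? cpuSeconds / wallSeconds : 0.0;
}
```

Taking one snapshot at transaction start and one at finish would give a single utilization figure with two cheap calls; taking them on the sampling timer would give the per-interval values needed for max and standard deviation.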
This has been superseded by mobile starfish and other endeavors.