sentry-dotnet
Support for profiling
Possibly through the same APIs used by dotnet-trace
https://docs.sentry.io/product/profiling/
This would be helpful for us
Came in on Discord:
Hi everyone - new user here 🙂 - Is there a recommendation for profiling on dotnet? We used the older version for a while and now with our signup noticed that dotnet isn't offered. Sorry if this is addressed elsewhere
After making a small change to the dotnet-trace command-line app to connect to its own process (ignore the --process-id arg and use Process.GetCurrentProcess().Id instead), it seems to be perfectly happy about profiling itself. See the attached profiles and a screenshot from speedscope
dotnet-trace.exe_20230216_122438.zip
With the dotnet-trace code being licensed under MIT, it seems like a good candidate for cherry-picking an in-process profiling implementation that could be part of the Sentry SDK. We would likely need a way to filter out profiling-related events so they don't confuse people?
The CPU usage of the whole dotnet-trace executable, as reported by the process monitor on my PC, was between 0.0 and 0.2 %. Since it wasn't doing anything other than collecting the trace, that should represent the actual cost of sample collection (and writing to file).
Status update:
I have rolled the nettrace processing directly into a fork of dotnet-trace (see the current working version here: https://github.com/vaind/diagnostics/tree/sentry-profiling). I am producing a JSON which I hope is correct, but I haven't had luck getting it to sentry.io yet. Likely the issue is with how I'm pushing the envelope with the profile manually (through sentry-cli), so it doesn't get associated with the transaction. Or it's just invalid; I can't tell at the moment because I don't see what's happening on the server :/
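For reference, a minimal sketch of putting the profile into the same envelope as its transaction, rather than uploading it on its own. It assumes the newline-delimited envelope layout (one envelope header line, then an item header line plus payload line per item) and the `transaction` and `profile` item types. The helper name `build_envelope` and the payload contents are hypothetical, and whether sharing an envelope is what fixes the association is an assumption, not something confirmed in this thread.

```python
import json


def build_envelope(transaction: dict, profile: dict) -> bytes:
    """Serialize a transaction and its profile into one envelope (sketch only)."""

    def item(item_type: str, payload: dict) -> bytes:
        # json.dumps without indent never emits raw newlines, so each payload
        # stays on a single line.
        body = json.dumps(payload).encode("utf-8")
        header = json.dumps({"type": item_type, "length": len(body)}).encode("utf-8")
        return header + b"\n" + body + b"\n"

    # Envelope header: here only the event id shared with the transaction.
    envelope_header = json.dumps({"event_id": transaction["event_id"]}).encode("utf-8")
    return envelope_header + b"\n" + item("transaction", transaction) + item("profile", profile)
```

The resulting bytes have five lines: the envelope header, then a header and payload for each of the two items.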
Two PRs need to get in in order to accept your profile:
- Add `dotnet` as a platform: https://github.com/getsentry/relay/pull/1885
- Deprecation of some fields: https://github.com/getsentry/relay/pull/1878
Looking at your profile, a few things I saw you'd need to correct:
- `timestamp` needs to be RFC3339 formatted, like `2023-03-01T10:10:10.123456789+06:00` (the more precision the better, down to the nanosecond if possible).
- `os.build` should be `os.build_number` and `os.build_number` should be `os.version`
- `transaction.id` needs to be a `uuid4` without `-`
- there's no field `id` in `thread_metadata` values
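A small sketch of the first and third points, in Python purely for illustration (the SDK itself would do this in C#). The helper names `rfc3339_nanos` and `new_transaction_id` are hypothetical, and the function assumes the timestamp is already available as whole Unix seconds plus a nanosecond remainder.

```python
import uuid
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def rfc3339_nanos(unix_seconds: int, nanos: int, offset_minutes: int = 0) -> str:
    """Format a timestamp as RFC 3339 with nine fractional digits."""
    tz = timezone(timedelta(minutes=offset_minutes))
    local = (EPOCH + timedelta(seconds=unix_seconds)).astimezone(tz)
    sign = "+" if offset_minutes >= 0 else "-"
    hours, minutes = divmod(abs(offset_minutes), 60)
    # datetime only carries microseconds, so the nanosecond part is appended manually.
    return f"{local:%Y-%m-%dT%H:%M:%S}.{nanos:09d}{sign}{hours:02d}:{minutes:02d}"


def new_transaction_id() -> str:
    """Return a random UUID (version 4) as 32 hex characters, without dashes."""
    return uuid.uuid4().hex
```

For example, `rfc3339_nanos(0, 123456789, 360)` gives `1970-01-01T06:00:00.123456789+06:00`.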
Also, at what rate are you sampling? We use 101Hz in our other SDKs.
Note to self:
The raw addresses have to be associated with some entity that knows its symbolic name. At this point things work very differently for native code and code that is JIT compiled on the fly:
- For native code, TraceEvent must find the DLL that includes the code. For this it needs information about all DLLs that were loaded in the process and what addresses they are loaded at. These are the kernel ImageLoad events.
- For JIT compiled code it needs to know the code ranges of all JIT compiled methods. For this it needs special .NET or Jscript events specifically designed for this purpose.
If the necessary events are not present, the best that can be done is to show the address value as a hexadecimal number (which is not very helpful). Thus it is critical that these events be present. Complicating this is the fact that many scenarios involve long running processes. If the process lives longer than the collection interval, then there can be image loads or JIT compilations that occurred before the trace started. We need these events as well. To get them, the ETW providers involved support something called 'CAPTURE_STATE', which causes them to emit events for all past image loads or JIT compilations. The logic for capturing data must explicitly include logic for triggering this CAPTURE_STATE.
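To make the lookup concrete, here is a minimal sketch of resolving a raw address against known code ranges, with the hexadecimal fallback described above. It is not TraceEvent's implementation; `CodeRange` and `SymbolTable` are hypothetical names, and the ranges are assumed to be non-overlapping and fed in from image-load and JIT method-load events (including the ones replayed via CAPTURE_STATE).

```python
import bisect
from dataclasses import dataclass


@dataclass(frozen=True)
class CodeRange:
    start: int  # first address covered by the module or JIT-compiled method
    size: int   # length of the range in bytes
    name: str   # DLL name or method name


class SymbolTable:
    def __init__(self) -> None:
        self._starts: list[int] = []        # sorted start addresses
        self._ranges: list[CodeRange] = []  # kept parallel to _starts

    def add(self, start: int, size: int, name: str) -> None:
        """Record a range, e.g. from an image-load or method-load event."""
        i = bisect.bisect_left(self._starts, start)
        self._starts.insert(i, start)
        self._ranges.insert(i, CodeRange(start, size, name))

    def resolve(self, address: int) -> str:
        """Return the name covering the address, or the address in hex."""
        i = bisect.bisect_right(self._starts, address) - 1
        if i >= 0:
            candidate = self._ranges[i]
            if address < candidate.start + candidate.size:
                return candidate.name
        # No event described this address, so only the raw value is left.
        return hex(address)
```

If the events for a module or method never arrive (for instance because it was loaded before the trace started and no rundown was triggered), `resolve` can only return the hex string.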
Both PRs have been merged and deployed, so no more blockers on our side.
Closing this through #2206
Follow ups are: #2315 and #2316