puffin
GlobalProfiler::lock().new_frame() can be very expensive due to compression
Here's a small cutout from a Superluminal profile of an app that uses puffin scopes quite extensively:
The pack() call here takes around 1ms.
Looks like the pack() call from add_frame() gets very expensive, partly because of zstd compression (can we turn down the compression level?). The bincode serialization doesn't look cheap either; maybe there's something faster we could switch to?
Ideally, pack() would run on a separate thread. Not sure how easy that would be, but it would certainly help "main thread" performance.
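To make the separate-thread idea concrete, here is a minimal sketch using only the standard library. It does not use puffin's actual types or API: `RawFrame`, `PackedFrame`, `spawn_packer`, and this `pack` are hypothetical names for illustration, and the packing step is a toy run-length encoder standing in for the real bincode + zstd work. The point is only the shape: the main thread pays for a channel send, and the expensive part happens on a worker.

```rust
use std::sync::mpsc::{self, Receiver, Sender};
use std::thread::{self, JoinHandle};

/// Hypothetical stand-in for an unpacked frame of profiling data.
pub struct RawFrame {
    pub index: u64,
    pub bytes: Vec<u8>,
}

/// Hypothetical stand-in for a packed (serialized and compressed) frame.
pub struct PackedFrame {
    pub index: u64,
    pub bytes: Vec<u8>,
}

/// Placeholder for the expensive step. This is a toy run-length encoder
/// (pairs of `count, byte`), NOT the serialization/compression puffin uses.
pub fn pack(frame: RawFrame) -> PackedFrame {
    let mut out = Vec::new();
    let mut iter = frame.bytes.iter().copied().peekable();
    while let Some(byte) = iter.next() {
        let mut run: u8 = 1;
        // Extend the run while the next byte matches, capped at 255.
        while run < u8::MAX && iter.peek() == Some(&byte) {
            iter.next();
            run += 1;
        }
        out.push(run);
        out.push(byte);
    }
    PackedFrame { index: frame.index, bytes: out }
}

/// Spawns a worker thread that packs every frame it receives.
/// The caller sends raw frames on the returned `Sender` and reads packed
/// frames from the returned `Receiver`. Dropping the `Sender` stops the worker.
pub fn spawn_packer() -> (Sender<RawFrame>, Receiver<PackedFrame>, JoinHandle<()>) {
    let (raw_tx, raw_rx) = mpsc::channel::<RawFrame>();
    let (packed_tx, packed_rx) = mpsc::channel::<PackedFrame>();
    let handle = thread::spawn(move || {
        // The loop ends once every `Sender<RawFrame>` has been dropped.
        for frame in raw_rx {
            // Stop early if nobody is listening for packed frames anymore.
            if packed_tx.send(pack(frame)).is_err() {
                break;
            }
        }
    });
    (raw_tx, packed_rx, handle)
}
```

With something like this, the per-frame cost on the main thread would be a single `send` of the raw frame. The trade-off is that packed frames become available slightly later, so anything that reads them has to tolerate that delay.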
#83 was supposed to solve this but pack passes still ran, see #87. When that's in, we can close this.
Don't think we've seen this since #87 was merged, so closing.