
Periodically store breadcrumbs to disk for OOMs

philipphofmann opened this issue 3 years ago • 2 comments

Description

We could store the breadcrumbs to disk periodically, for example every 5 seconds, and when an OOM occurs, we could attach them. Furthermore, the breadcrumb for UIApplicationDidReceiveMemoryWarningNotification could contain the current memory footprint. We could also keep a file stream open and just append each breadcrumb to the open file.
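As a rough, language-agnostic sketch of the periodic approach (all names here — `PeriodicBreadcrumbStore`, `flush_interval` — are hypothetical, not the Sentry SDK's API), the idea is an in-memory list that gets serialized to disk on a timer, so that after an OOM the last flushed snapshot survives:

```python
# Hypothetical sketch: flush an in-memory breadcrumb list to disk
# every few seconds. On an OOM crash, the last flushed snapshot
# (at most flush_interval seconds stale) is what can be attached.
import json
import threading

class PeriodicBreadcrumbStore:
    def __init__(self, path, flush_interval=5.0):
        self.path = path
        self.flush_interval = flush_interval
        self.crumbs = []
        self.lock = threading.Lock()

    def add(self, crumb):
        with self.lock:
            self.crumbs.append(crumb)

    def flush(self):
        # Serialize the whole list as one JSON array; anything added
        # after the last flush is lost if the OS kills the process.
        with self.lock:
            snapshot = list(self.crumbs)
        with open(self.path, "w") as f:
            json.dump(snapshot, f)

    def start(self):
        # Re-arm a daemon timer so flushing happens off the main thread.
        def loop():
            self.flush()
            timer = threading.Timer(self.flush_interval, loop)
            timer.daemon = True
            timer.start()
        loop()
```

The trade-off is visible in `flush`: it rewrites the whole file each time, and up to `flush_interval` seconds of the newest crumbs can be lost — which is what the later comments in this thread push back on.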

Check the sentry-native SDK for a similar implementation. You can reach out to @bruno-garcia for details.

philipphofmann avatar Apr 13 '22 13:04 philipphofmann

Make sure to add an entry to the crumbs list when loading from disk to indicate the app has restarted. The goal is to be clear that "behind this crumb, entries happened on a previous app execution".

This could also be useful for #316. Possibly for any issue: we could always load the last run's crumbs into memory on startup, so the app starts with some information about what was going on before this run.
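A minimal sketch of the restart marker described above (the crumb shape and function name are illustrative, not the SDK's actual API): load the previous run's crumbs at startup and append one synthetic crumb so everything before it is clearly from the prior execution.

```python
# Hypothetical sketch: load the previous run's crumbs and mark the
# boundary with a synthetic "app restarted" breadcrumb.
import json
import os

RESTART_CRUMB = {
    "category": "app.lifecycle",
    "message": "App restarted; crumbs before this one are from the previous run",
}

def load_previous_crumbs(path):
    if not os.path.exists(path):
        return []
    with open(path) as f:
        crumbs = json.load(f)
    # Everything loaded so far happened in a previous app execution,
    # so append the marker before this run's crumbs start accumulating.
    crumbs.append(RESTART_CRUMB)
    return crumbs
```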

bruno-garcia avatar Apr 20 '22 13:04 bruno-garcia

Related to https://github.com/getsentry/sentry-java/issues/547

marandaneto avatar Apr 22 '22 08:04 marandaneto

Let's see if we can write the breadcrumbs to a file as they come in, in a background thread, instead of doing it periodically. Keep the file open, and append to it.

After all, the latest breadcrumbs are the most important ones for figuring out what is causing the issue, and they are exactly the ones you'd miss out on when writing periodically.
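A sketch of that append-as-they-arrive idea (class and file names are hypothetical): the caller enqueues each crumb cheaply, and a background worker appends it as one JSON line to a file handle that stays open.

```python
# Hypothetical sketch: keep one file handle open and append each crumb
# as a JSON line from a background thread, so the newest crumbs are
# always on disk when an OOM strikes.
import json
import queue
import threading

class AppendingBreadcrumbWriter:
    def __init__(self, path):
        self.file = open(path, "a")
        self.queue = queue.Queue()
        worker = threading.Thread(target=self._drain, daemon=True)
        worker.start()

    def add(self, crumb):
        self.queue.put(crumb)  # cheap on the caller's thread; IO happens elsewhere

    def _drain(self):
        while True:
            crumb = self.queue.get()
            self.file.write(json.dumps(crumb) + "\n")
            self.file.flush()  # make the newest crumb durable immediately
            self.queue.task_done()
```

Note the "one JSON dictionary per line" format: the file as a whole is never a valid JSON array, which is exactly the deserialization concern raised in the next comment.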

kevinrenskers avatar Oct 27 '22 13:10 kevinrenskers

There are a few problems with appending the breadcrumbs to a file:

  1. We can't deserialize them back into breadcrumbs in one go, because we're not writing a valid JSON array to disk. Instead we're writing many JSON dictionaries to the file, one per line, which have to be deserialized one by one. On the other hand, it's not clear that this is actually much slower: you're still deserializing the same number of bytes in the end, only now in smaller pieces in a loop.
  2. We can't limit the number of breadcrumbs written to the file to 100, because we're only appending to it. Yes, we create a clean file on SDK start, but even then the file can grow very big very rapidly when every single click is a breadcrumb. When reading the file back in the OOM case we can of course throw away everything except the last 100 lines, but that's not very efficient; there's too much overhead when the file (potentially) ends up with thousands of breadcrumbs.
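To make both problems concrete, here is a sketch of the read-back path for such a one-JSON-dictionary-per-line file (names are illustrative; `MAX_CRUMBS = 100` comes from the limit discussed above): each line is deserialized individually, and only the newest 100 are kept.

```python
# Hypothetical sketch: read back a JSON-lines breadcrumb file,
# deserializing line by line and keeping only the newest `limit` crumbs.
import json
from collections import deque

MAX_CRUMBS = 100

def read_last_crumbs(path, limit=MAX_CRUMBS):
    # A bounded deque discards older lines as newer ones arrive, so
    # memory stays flat even if the file holds thousands of lines --
    # but every line still has to be read off disk first, which is
    # the overhead point 2 complains about.
    with open(path) as f:
        tail = deque(f, maxlen=limit)
    return [json.loads(line) for line in tail]
```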

But what's the alternative? Keep an in-memory array of at most 100 breadcrumbs and write it to disk every time a breadcrumb comes in? That's a big performance hit due to IO; even just serializing that array on every breadcrumb is a lot of work. We could go back to writing the array to disk periodically, but then we're back to the problem where the most important data might be lost in the case of an OOM.

So I think it makes sense to continue with the current approach of appending breadcrumbs to an open file. I don't think point 1 is really an issue, as deserializing many small items probably takes about the same amount of time as deserializing one big array with the same number of items (I will measure this!). But the fact that the file will grow and grow and grow… hmm.

You can only append to a file; anything else means a complete rewrite of the entire file. So keeping a counter in memory and removing the first line of the file once it hits 100 just doesn't work. We could instead split the breadcrumbs across multiple files: write 100 breadcrumbs to a file, then create a new file and start writing to that. Then we only have to keep the two most recent files to always have a maximum of 200 breadcrumbs stored on disk. It's more state to keep track of, but it would keep the files small, minimize IO overhead by still using "append to file" instead of full writes, and keep reading and deserializing the data cheap, because we're only ever dealing with a maximum of 200 lines.
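The two-file rotation described above can be sketched like this (a minimal illustration with hypothetical names, not the SDK's implementation): crumbs are appended to the active file, and once it reaches 100 lines the writer switches to the other file, truncating it, so at most 200 crumbs ever live on disk.

```python
# Hypothetical sketch of the two-file rotation: append to the active
# file; after 100 lines, truncate the other file and switch to it.
# Reading both files back yields at most 200 crumbs, oldest first.
import json
import os

MAX_LINES_PER_FILE = 100

class RotatingBreadcrumbWriter:
    def __init__(self, path_a, path_b):
        self.paths = [path_a, path_b]
        self.active = 0   # index of the file currently being appended to
        self.count = 0    # lines written to the active file so far
        self.file = open(self.paths[self.active], "w")

    def add(self, crumb):
        if self.count >= MAX_LINES_PER_FILE:
            self._rotate()
        self.file.write(json.dumps(crumb) + "\n")
        self.file.flush()
        self.count += 1

    def _rotate(self):
        # Switch to the other file, discarding whatever it held before,
        # so disk usage is bounded at 2 files x 100 lines.
        self.file.close()
        self.active = 1 - self.active
        self.file = open(self.paths[self.active], "w")
        self.count = 0

    def read_all(self):
        # The non-active file holds the older crumbs; read it first
        # so the result stays in chronological order.
        crumbs = []
        for i in (1 - self.active, self.active):
            if os.path.exists(self.paths[i]):
                with open(self.paths[i]) as f:
                    crumbs.extend(json.loads(line) for line in f)
        return crumbs
```

The appealing property is that every write is still a pure append; the only non-append operation is truncating a file that is about to be overwritten anyway.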

Sounds like a pretty good solution to me, what do you guys think?

kevinrenskers avatar Oct 31 '22 21:10 kevinrenskers