prevent excessive memory usage
Or: Don't let a typo hang the user's machine
Eduardo reported in chat:
I made a typo in a transaction: the year was 20211 instead of 2021. Later, I tried to generate a balance report with --daily. My CPU and memory usage went through the roof, and my computer didn't freeze only because I stopped it soon enough. Can we consider this a bug, or only a user error?
We must consider it both, I think - it's not very good if it's this easy to hang your machine with hledger. Here are some tasks that come to mind:
- Reproduce and study (quantify) this case
- Search for and catalog other ways to cause the same result
- Optimise them so they perform well enough (& decide what that means), if possible
- As a fallback, try to detect operations likely to exceed available memory/acceptable resource usage, and fail with an explanation ? (Process no more than 1m transactions per 1G of system RAM, or similar ? There could also be an interactive warning prompt, but we don't like those by default..) See the sketch after this list.
- Allow this to be overridden with a --no-memory-check flag.
Any thoughts ?
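For concreteness, here's a minimal sketch of that fallback idea; none of these names exist in hledger, and the budget is just the "1m transactions per 1G of RAM" rule of thumb from the list above:

```haskell
-- Hypothetical guard, not actual hledger code: refuse to run when the
-- workload looks likely to exceed a rough per-RAM budget, unless the
-- user passed --no-memory-check.
checkResourceLimit
  :: Bool     -- ^ was --no-memory-check given?
  -> Integer  -- ^ system RAM in bytes
  -> Integer  -- ^ number of transactions about to be processed
  -> Either String ()
checkResourceLimit noCheck ramBytes ntxns
  | noCheck        = Right ()
  | ntxns > budget = Left $
      "this report would process " ++ show ntxns
      ++ " transactions, above the estimated safe limit for this machine ("
      ++ show budget ++ "); use --no-memory-check to try it anyway"
  | otherwise      = Right ()
  where
    -- ~1 million transactions per GiB of system RAM, with a floor of 1m
    budget = max 1000000 (ramBytes `div` (1024 * 1024 * 1024) * 1000000)
```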
As I understand it, the crux of this error is that a simple typo can make hledger generate so many computer-generated postings that the machine hangs. There are three types of computer-generated postings:
- Auto-postings
- Periodic postings
- Summary postings
In this case it looks like summary postings are the proximate cause, but any of the others will compound the problem, resulting in 1, 2, or 4 traversals over the whole list of postings/transactions.
I think that summary postings with no content probably don't need to actually be stored, but can be generated on-demand at display time. Is this the case now, or do they continue to live in memory?
Another problem might be the amount of time to display large amounts of text on the terminal. Is this the case here, or does it hang before that stage?
Otherwise, I think the performance considerations we've been discussing regarding periodic and auto-postings apply here as well.
Terminology police - I'd say automatic postings (generated by --auto), periodic transactions (generated by --forecast/--budget), and summary transactions (generated by -D/-W/-M/-Q/-Y/--period).
We don't know the exact command yet - Eduardo posted briefly on 2021-09-03. I suspect you're right: it was probably summary transactions generated by balance --daily, with the report end date defaulting to 20211-01-01. (I'm not at a computer, or I'd try it.)
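If that guess is right, a tiny journal like this should reproduce it (untested here, and not Eduardo's actual data):

```journal
2021-01-01 groceries
    expenses:food           10
    assets:cash

20211-01-02 the typo: year 20211 instead of 2021
    expenses:misc            1
    assets:cash
```

With this file, `hledger -f typo.journal balance --daily` would have to cover roughly 18,000 years - about 6.6 million daily subperiods - from one stray digit.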
Auto postings are only added to existing or generated transactions, and it's hard to generate a large number of them per transaction - so I think we can concentrate on ways to generate large numbers of periodic transactions (a periodic rule, report period, or forecast period causing a long period containing many transaction occurrences) or summary transactions (journal dates or a report period causing a long period with many short subperiods).
I agree that it's hard to generate a lot of auto-postings themselves, but the presence of any auto-postings at all effectively doubles the number of postings in the journal due to the requirement to do a second balancing pass over everything.
A note to keep in mind: one possibility that might improve things is to use a proper streaming library like streamly or conduit.
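As a toy illustration of the streaming idea (not something hledger does today), conduit can fold over postings in constant space instead of materialising whole intermediate lists:

```haskell
import Conduit  -- from the conduit package

-- Simplified stand-in for hledger's Posting type.
data Posting = Posting { pamount :: Integer }

-- Sum posting amounts without holding more than the running
-- accumulator in memory.
sumPostings :: [Posting] -> Integer
sumPostings ps =
  runConduitPure $ yieldMany ps .| foldlC (\acc p -> acc + pamount p) 0
```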
I'm not sure if this is the issue here, but something I've just noticed.
PostingsReport generates summary reports by building the list of intervals to report on and then, for each interval, filtering the whole list of postings by whether they lie in that interval. So if we have n intervals, this requires n traversals of the whole list of postings. For a daily report up to the year 20211, that is about 6.6 million traversals of the posting list.
Provided we can assume that the list of postings is sorted by date (either because we explicitly sort it or because it already is), we can get this down to just one traversal.
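A minimal sketch of that single-traversal approach, using simplified stand-ins for hledger's Posting and DateSpan types:

```haskell
import Data.Time.Calendar (Day)

data Posting  = Posting  { pdate :: Day } deriving Show
data DateSpan = DateSpan { spanStart :: Day, spanEnd :: Day } deriving Show

-- Assign each posting to its report interval in one pass. Assumes
-- 'spans' are sorted, contiguous and cover every posting date, and
-- 'ps' is sorted by date, so we never need to re-scan the list:
-- O(n + m) instead of the O(n * m) filter-per-interval approach.
groupByInterval :: [DateSpan] -> [Posting] -> [(DateSpan, [Posting])]
groupByInterval []     _  = []
groupByInterval (s:ss) ps =
  let (here, rest) = span (\p -> pdate p < spanEnd s) ps
  in  (s, here) : groupByInterval ss rest
```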
MultiBalanceReport will create a Map DateSpan [Posting] so it should not suffer from this issue, though it will have to keep them all in memory to create the Map.
Nice! A profile will show this clearly.
It would be helpful to know which command was being run in the original error report, and a little about the size of the journal and the other options involved, but regardless I'm seeing big slowdowns even with a simple journal here.
Another potential fix: we can restrict the date parser to only allow 4-digit years. As much as I hate contributing to the Y10k problem, it may be a worthwhile trade-off.
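A rough sketch of that restriction with megaparsec (hledger's real date parser is more involved than this): requiring exactly four year digits makes 20211-01-01 a parse error instead of a far-future date.

```haskell
import Data.Void (Void)
import Text.Megaparsec (Parsec, count)
import Text.Megaparsec.Char (char, digitChar)

type Parser = Parsec Void String

-- Exactly four year digits: "2021-01-01" parses, while "20211-01-01"
-- fails when char '-' meets the fifth digit.
datep :: Parser (Integer, Int, Int)
datep = do
  y <- read <$> count 4 digitChar
  _ <- char '-'
  m <- read <$> count 2 digitChar
  _ <- char '-'
  d <- read <$> count 2 digitChar
  pure (y, m, d)
```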
No way! We want to be able to model far-future cosmic events. :)
Another possibility: we could disallow more than n reporting intervals, and throw an error when more are requested. If we choose n = 10 000, the error would only be thrown if somebody requests a daily report over more than 27 years. Choosing n = 1 000 000 allows daily reports spanning up to 2739 years, which can still be handled fairly comfortably in memory.
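A sketch of that cap (hypothetical names, not an existing hledger check):

```haskell
import Data.Time.Calendar (Day, diffDays)

maxIntervals :: Integer
maxIntervals = 1000000  -- allows daily reports spanning ~2739 years

-- Count the subperiods a report would have before materialising any
-- of them, and refuse clearly absurd requests with a hint at the
-- likely cause.
checkIntervalCount :: Day -> Day -> Integer -> Either String ()
checkIntervalCount start end intervalDays
  | n > maxIntervals = Left $
      "this report would have " ++ show n ++ " subperiods (limit "
      ++ show maxIntervals ++ "); is there a date typo in the journal?"
  | otherwise = Right ()
  where
    n = diffDays end start `div` intervalDays
```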
I'm going to keep this open for now. Let's try to confirm this one with Eduardo and keep looking for other ways to use excessive memory.
A valuable related resource: https://haskell.foundation/hs-opt-handbook.github.io/