
Balance transactions more robustly and compatibly

Open simonmichael opened this issue 6 months ago • 4 comments

When checking if a transaction is balanced, until now hledger has used a commodity's global display precision as its balancing precision. This turns out not to be ideal. One familiar annoyance is that increasing the display precision (eg using -c to show more decimals) can make transactions unbalanced, so that hledger can no longer read the journal. Another is that journals exported from Ledger or Beancount can be rejected as unbalanced by hledger, until commodity directives are added to control the balancing precision.

This PR makes hledger use only the local transaction's precisions for balancing, like Ledger and Beancount. Ie, the precisions inferred from the (non-cost) amounts in the transaction being balanced. This nicely separates balancing and display, and makes us better at exchanging journal entries with Ledger and Beancount.
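To make "local precision" concrete, here is a minimal Python sketch (not hledger's actual Haskell code) of inferring a per-commodity balancing precision from the amounts written in one journal entry. The function names and the sample entry are hypothetical, and the caller is assumed to have already left out cost amounts.

```python
from decimal import Decimal

def decimal_places(amount: str) -> int:
    """Number of digits written after the decimal point, e.g. '1.120' -> 3."""
    exponent = Decimal(amount).as_tuple().exponent
    return max(0, -exponent)

def local_precisions(postings):
    """Map each commodity to the highest precision among the entry's amounts.

    postings: list of (commodity, amount_string) pairs. Cost amounts are
    assumed to have been excluded by the caller.
    """
    precisions = {}
    for commodity, amount in postings:
        precisions[commodity] = max(precisions.get(commodity, 0),
                                    decimal_places(amount))
    return precisions

# Made-up entry: two $ amounts written with 1 and 2 decimals, one EUR amount with none.
entry = [("$", "10.5"), ("$", "-10.50"), ("EUR", "3")]
print(local_precisions(entry))  # {'$': 2, 'EUR': 0}
```

Note that trailing zeros count: writing `-10.50` rather than `-10.5` raises the inferred precision for `$` to 2.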

simonmichael, Jun 08 '25 03:06

Hmm, but while more robust in principle, the new balancing strategy can reject some entries that we previously accepted.

I assume those will be easy to fix manually by making their decimals a little more correct. But it won't be ok to suddenly require people to do that. Perhaps we'll need to offer (or try) both behaviours ?

simonmichael, Jun 08 '25 04:06

This gets complicated, so I've written up my understanding of how transaction balancing works in Ledger/hledger/Beancount (https://joyful.com/PTA+transaction+balancing), and some issues with hledger's current behaviour. The nub for this PR is this:

Problem: this reveals inexact entries which were previously masked by commodity directives, breaking the journal.
Solution ? Allow the local balance-checking precisions to be reduced by commodity directives, for backward compatibility.
Ie commodity directives could make balance checking less precise (only).
If entries are exported without the commodity directive, they would be revealed as invalid, and that would be a time to fix them.

simonmichael, Jun 08 '25 23:06

I have tested the new behaviour a little bit with my files. So far, I found only a few transactions per year which are now considered unbalanced, and it has been relatively easy to fix them manually (adding an expenses:rounding posting with no amount is one way).

But I can see this being tricky/annoying for users after upgrading. I wonder if we need to keep the old behaviour, and maybe an improved transitional behaviour, perhaps something like this:

--txn-balancing=

  old - use precision inferred from the whole journal, overridable by commodity directive or -c
    legacy behaviour; compatible with hledger <=1.43
    display precision is also transaction balancing precision; increasing it breaks journal reading

  compat - use precision inferred from the transaction, reducible by commodity directive
    more robust when there is no commodity directive
    reducing the display precision via commodity directive, also reduces transaction balancing precision; can fix journal reading
    increasing the display precision does not break journal reading
    compatible with ledger, beancount, hledger <=1.43

  robust - use precision inferred from the transaction
    most strict - old transactions may need to be adjusted
    simplest, most robust overall ?
    display precision and transaction balancing precision are independent; display precision never affects journal reading
    compatible with ledger, beancount
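The difference between the three proposed modes comes down to which precision is chosen for the balance check. A rough Python sketch of that choice, assuming the precisions have already been worked out elsewhere (the function and parameter names are hypothetical, not hledger internals):

```python
def balancing_precision(mode, local, global_display, directive=None):
    """Pick the number of decimal places used for the balance check.

    mode           -- 'old', 'compat' or 'robust' (names from the proposal above)
    local          -- precision inferred from this transaction's non-cost amounts
    global_display -- effective display precision (whole journal, directives, -c)
    directive      -- precision set by a commodity directive, or None if absent
    """
    if mode == "old":
        return global_display
    if mode == "compat":
        # A directive may only reduce the local precision, never raise it.
        return local if directive is None else min(local, directive)
    if mode == "robust":
        return local
    raise ValueError(f"unknown mode: {mode}")
```

For example, with a local precision of 4 and a directive specifying 2 decimals, `compat` would check at 2 decimals, while `robust` would check at 4.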

simonmichael, Jun 10 '25 05:06

It's really cool, but for compatibility maybe we should add a command line option to specify the global balancing strategy. That way users can force hledger to balance in a specific way.

We could also add a directive that declares which balancing strategy should be used for each part of the journal.

jack9603301, Jun 10 '25 06:06

Though breaking people's working journals is bad, creating more complications by trying to avoid that is also bad. So I've gone with a simpler toggle:

     --txn-balancing=...    how to check that transactions are balanced:
                            'old':   - use global display precision
                            'exact': - use transaction precision (default)

Here's the doc:

Transaction balancing

How exactly does hledger decide when a transaction is balanced ? Especially when it involves costs, which often are not exact, because of repeating decimals, or imperfect data from financial institutions ? In each commodity, hledger sums the transaction's posting amounts, after converting any with costs; then it checks if that sum is zero, when rounded to a suitable number of decimal digits - which we call the balancing precision.

Note, this changed with hledger 1.44. Older hledger versions reused display precision as balancing precision, which causes problems (over-reliance on commodity directives; fragility; see issue #2402). Since 1.44, hledger infers balancing precision in each transaction just from the amounts in that transaction. Eg when checking the balance of commodity A, it uses the highest decimal precision seen for A in that transaction's journal entry (excluding cost amounts). This makes transaction balancing more robust, and improves our ability to read journals exported from Ledger and Beancount, and vice versa.
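As an illustration of the check described above, here is a simplified Python sketch: convert postings with costs, sum per commodity, then round each sum to the precision inferred from that entry's non-cost amounts. It is a model for discussion, not hledger's implementation; the tuple layout, the use of unit costs only, and half-even rounding are assumptions.

```python
from collections import defaultdict
from decimal import Decimal, ROUND_HALF_EVEN

def places(d):
    """Decimal places as written, e.g. Decimal('-10.91') -> 2."""
    return max(0, -d.as_tuple().exponent)

def is_balanced(postings):
    """postings: list of (quantity, commodity, unit_cost, cost_commodity).

    unit_cost and cost_commodity are None for postings without a cost.
    """
    sums = defaultdict(Decimal)
    precision = defaultdict(int)
    for qty, commodity, unit_cost, cost_commodity in postings:
        if unit_cost is None:
            sums[commodity] += qty
            # Only non-cost amounts contribute to the balancing precision.
            precision[commodity] = max(precision[commodity], places(qty))
        else:
            # Convert to cost; the cost amount does not set the precision.
            sums[cost_commodity] += qty * unit_cost
    for commodity, total in sums.items():
        quantum = Decimal(1).scaleb(-precision[commodity])
        if total.quantize(quantum, rounding=ROUND_HALF_EVEN) != 0:
            return False
    return True

# Made-up entry: 10 EUR @ $1.0909 against $-10.91.
# Converted sum is $-0.001, which rounds to zero at 2 decimals.
print(is_balanced([
    (Decimal("10"), "EUR", Decimal("1.0909"), "$"),
    (Decimal("-10.91"), "$", None, None),
]))  # True
```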

Unfortunately it can also reject some journal entries that worked with older hledger. This might also happen when converting CSV files. If you hit this problem, here's how to fix it:

  • You can add --txn-balancing=old to the command, or to your ~/.hledger.conf file. This restores the pre-1.44 behaviour, allowing you to keep using old journals unchanged.

  • Or you can fix the problem entries. There are three common ways; use whichever seems easiest/best:

    1. make cost amounts more precise (eg adding more decimal digits)
    2. or make non-cost amounts less precise (removing unnecessary decimal digits that are raising the precision)
    3. or add one amountless posting to absorb the imbalance (eg to "expenses:rounding").

If you see any problems with this, let me know.

simonmichael, Jun 10 '25 23:06

To recap, the reason for changing things is to solve these problems:

  • hledger's transaction balancing (and journal reading) is quite dependent on display precisions; and more specifically, on commodity directives, amounts in other transactions, and even amounts in P directives. These things should be independent.
  • hledger's -c/--commodity-style option is handy for seeing extra decimal places when you need to; but typically it can't be used for that because it'll break transaction balancing.
  • hledger may initially reject entries converted from Ledger or Beancount, until you add commodity directives to control the balancing precision; this is unintuitive.
  • hledger entries (with costs) can have an imbalance that's masked by commodity directives limiting balancing precision; this is usually harmless, but eg if exported to Ledger or Beancount they will be rejected. Also, requiring that every imbalance is visibly accounted for in the local journal entry seems preferable on principle.

simonmichael, Jun 10 '25 23:06

And this might justify a version jump to 1.50, signalling the unusual breakage ("Unfortunately it can also reject some journal entries that worked with older hledger"); I don't know.

simonmichael, Jun 11 '25 00:06

And to recap more: the compatibility workaround I was contemplating was to offer a third option, compat, as the default:

     --txn-balancing=...    how to check that transactions are balanced:
                            'old':    - use global display precision
                            'compat': - use transaction precision, reducible by directive (default)
                            'exact':  - use transaction precision

This would be a hybrid, primarily using the transaction precisions but if commodity directives specify a smaller display precision, using that instead. This would avoid breaking old journals, while also being better behaved than old.

There'd be no point to compat unless we made it the default. But then most people would stick with the current fragile behaviour for ever.

simonmichael, Jun 11 '25 16:06

From chat:

Q: does that mean, with the old approach, accounts can theoretically accumulate tiny errors over time?

A: that kind of error does not arise, at least because we always reconcile bank accounts. I guess typically it's the cost amount that's inexact, the data we are importing or entering will have a correct bank/fiat amount (and if not we'd notice)

I guess I can check for this now with my own data.

hledger bal -5 cur:\\$ -c '$1.000000000000' | rg '\...0*[1-9]'

(Show account balances, limited to depth 5, limited to the $ currency, overriding $'s display style to show 12 decimal places. Then filter that to show just the lines where there's a non-zero decimal digit in the 3rd place or beyond.)
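For anyone unsure what the `rg` pattern matches, here is the same regular expression applied in Python to a few made-up report lines (the amounts and account names are hypothetical, not taken from a real journal):

```python
import re

# Same pattern as the rg filter above: a literal '.', any two characters,
# optional zeros, then a non-zero digit.
beyond_cents = re.compile(r"\...0*[1-9]")

sample_lines = [
    "         $100.000000000000  assets:checking",
    "          $12.005000000000  expenses:food",
    "           $0.500000000000  expenses:fees",
]

for line in sample_lines:
    if beyond_cents.search(line):
        print(line)  # only the expenses:food line is printed
```

The pattern relies on every amount being shown with the same fixed number of decimals, which the -c option arranges here.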

I do have a half cent balance in one asset account and one liability account, and in half a dozen expense accounts. Which is normally shown rounded to a cent.

And I have a small 15 digit decimal balance in equity:starting balances (from lot starting balances getting multiplied by their costs, producing more decimals).

So yes, with the old method I guess unaccounted-for remainders can accumulate, in accounts that (a) often have an inexact posting amount or cost amount and (b) are never reconciled - such as my equity, revenue, and expense accounts.
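A toy Python illustration of that accumulation, with made-up amounts: each entry is off by $0.004, which the old check (rounding to an assumed global display precision of 2) accepts, so the remainder builds up in the account that is never reconciled.

```python
from decimal import Decimal

def old_style_accepts(amounts, display_precision):
    """Old check: round the entry's sum to the global display precision."""
    total = sum(Decimal(a) for a in amounts)
    return total.quantize(Decimal(1).scaleb(-display_precision)) == 0

entry = {"expenses:food": "3.334", "assets:cash": "-3.33"}  # made-up amounts
balances = {account: Decimal(0) for account in entry}

for _ in range(3):  # the same slightly inexact entry, recorded three times
    assert old_style_accepts(entry.values(), display_precision=2)
    for account, amount in entry.items():
        balances[account] += Decimal(amount)

print(balances["expenses:food"])   # 10.002
print(sum(balances.values()))      # 0.012
```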

simonmichael, Jun 11 '25 16:06

New doc:

Transaction balancing

How exactly does hledger decide when a transaction is balanced ? Especially when it involves costs, which often are not exact, because of repeating decimals, or imperfect data from financial institutions ? In each commodity, hledger sums the transaction's posting amounts, after converting any with costs; then it checks if that sum is zero, when rounded to a suitable number of decimal digits - which we call the balancing precision.

Since version 1.44, hledger infers balancing precision in each transaction from the amounts in that transaction's journal entry (like Ledger). Ie, when checking the balance of commodity A, it uses the highest decimal precision seen for A in the journal entry (excluding cost amounts). This makes transaction balancing robust; any imbalances must be visibly accounted for in the journal entry, display precision can be freely increased with -c, and compatibility with Ledger and Beancount journals is good.

Note that hledger versions before 1.44 worked differently: they allowed display precision to override the balancing precision. This masked small imbalances and caused fragility (see issue #2402). As a result, some journal entries (or CSV rules) that worked with hledger <1.44, are now rejected with an "unbalanced transaction" error. If you hit this problem, it's easy to fix:

  • You can restore the old behaviour, by adding --txn-balancing=old to the command or to your ~/.hledger.conf file. This lets you keep using old journals unchanged, though without the above benefits.

  • Or you can fix the problem entries (recommended). There are three ways, use whichever seems best:

    1. make cost amounts more precise (add more/better decimal digits)
    2. or make non-cost amounts less precise (remove unnecessary decimal digits that are raising the precision)
    3. or add one amountless posting to absorb the imbalance (eg "expenses:rounding").
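The three fixes can be checked with a little arithmetic. The Python sketch below uses a made-up entry (3 EUR @ $0.3333 against $-1.0000) and a simplified single-commodity check; the helper name `check` is hypothetical, and the rounding mode is Python's default (half-even), which is an assumption about hledger.

```python
from decimal import Decimal

def check(cost_value, plain_amounts):
    """Balance check for a single commodity.

    cost_value    -- a posting already converted to cost (does not set precision)
    plain_amounts -- the entry's non-cost amounts in that commodity, as written
    Returns the sum rounded to the local balancing precision; zero means balanced.
    """
    plain = [Decimal(a) for a in plain_amounts]
    precision = max(max(0, -a.as_tuple().exponent) for a in plain)
    total = cost_value + sum(plain)
    return total.quantize(Decimal(1).scaleb(-precision))

original = check(3 * Decimal("0.3333"), ["-1.0000"])   # -0.0001: unbalanced
fix1 = check(3 * Decimal("0.33333"), ["-1.0000"])      # more precise cost: balanced
fix2 = check(3 * Decimal("0.3333"), ["-1.00"])         # less precise non-cost amount: balanced

# Fix 3: a single amountless posting is assigned whatever makes the sum zero.
rounding_posting = -(3 * Decimal("0.3333") + Decimal("-1.0000"))

print(original == 0, fix1 == 0, fix2 == 0)  # False True True
print(rounding_posting)                     # 0.0001
```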

simonmichael, Jun 11 '25 18:06

I agree that display precision should be distinct from calculation precision, so I think the change is good.

The documentation is clear.

Does it notably affect performance for a 15 000-line journal? After the change, precision has to be determined for every transaction.

Minor remark: expenses:rounding does not work if there is already a balancing posting.

mvhulten, Jun 12 '25 15:06

Thanks @mvhulten. I tried to suggest there should be at most "one amountless posting", I'll clarify.

simonmichael, Jun 12 '25 16:06

A UX improvement: if it sees an unbalanced transaction that older hledger would have accepted, it adds an informative note.

5 | 2025-01-01
  |     a         $1.1206
  |     b         $-1.120

This transaction is unbalanced.
The real postings' sum should be 0 but is: $0.0006
Note, hledger <1.50 accepted this entry because of the global display precision,
but hledger 1.50+ checks more strictly, using the entry's local precision.
You can use --txn-balancing=old to keep it working, or fix it (recommended);
see 'Transaction balancing' in the hledger manual.
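The numbers in that message can be reproduced with Python's decimal module. The global display precision of 2 is an assumption for illustration (eg what a `commodity $1.00` directive would give); the rest follows from the entry itself.

```python
from decimal import Decimal

a, b = Decimal("1.1206"), Decimal("-1.120")
total = a + b  # Decimal('0.0006')

# Local precision: the most decimal places written in this entry (4 and 3 -> 4).
local_precision = max(-a.as_tuple().exponent, -b.as_tuple().exponent)

def rounds_to_zero(value, precision):
    return value.quantize(Decimal(1).scaleb(-precision)) == 0

assumed_global_precision = 2  # assumption, not taken from the example above

print(rounds_to_zero(total, assumed_global_precision))  # True: the old check accepts
print(rounds_to_zero(total, local_precision))           # False: the new check rejects
```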

simonmichael, Jun 12 '25 19:06

Thanks for the comments; this has been merged, and included in the nightly binaries (https://github.com/simonmichael/hledger/releases/tag/nightly).

simonmichael, Jun 13 '25 06:06

> Thanks @mvhulten. I tried to suggest there should be at most "one amountless posting", I'll clarify.

Perfect. And you're right: in retrospect the original formulation with "one" was just as good as the more verbose formulation!

mvhulten, Jun 13 '25 07:06