
easy Ledger file reading

Open simonmichael opened this issue 2 years ago • 5 comments

Capturing this from chat:

Despite all the work on supporting Ledger syntax, the vast majority of Ledger users who try to read their file with hledger fail and move on. That includes the folks who don't use value expressions, and the folks who have been told the ledger print | hledger -f- ... trick.

You'd think that last one would work: ledger print is always valid h/ledger syntax, right? I think now there are two main snags:

  • needing to set LANG, because everyone has non-ASCII text and Haskell programs throw up their hands if they see that without a proper LANG
  • needing to add commodity directives or -c options, because ledger print adds decimal zeros, forcing hledger to check transaction-balancedness more precisely than Ledger does

Definitely time we had a better strategy here.
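To make the second snag concrete, here is a minimal Python sketch of why extra decimal zeros can turn a balanced transaction into an unbalanced one. It is a simplified model, not hledger's or Ledger's actual algorithm: it assumes balancing means "the sum rounds to zero at some precision", and that the precision is taken from the number of decimal places written in the file. The helper names `decimal_places` and `balances` are made up for this illustration.

```python
from decimal import Decimal

def decimal_places(amount: str) -> int:
    """Number of digits written after the decimal point in an amount string."""
    return len(amount.partition(".")[2])

def balances(total: Decimal, precision: int) -> bool:
    """True if the transaction total rounds to zero at `precision` decimal places."""
    return round(total, precision) == 0

# 10 EUR bought at 1.3333 USD each and paid for with 13.33 USD
# leaves a residual of 0.0030 USD.
residual = Decimal("10") * Decimal("1.3333") - Decimal("13.33")

# Payment written as -13.33: precision 2, the residual rounds away.
print(balances(residual, decimal_places("-13.33")))    # True
# Payment written as -13.3300 (zeros added): precision 4, the residual remains.
print(balances(residual, decimal_places("-13.3300")))  # False
```

In this model, the commodity directive or -c workaround corresponds to choosing the smaller precision explicitly instead of inferring it from the padded amounts.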

Wishes

  • Users can easily find out the requirements and workarounds for making a Ledger file hledger-readable
  • Support providers can easily, with low effort from the user, estimate the (in)compatibility level of user's Ledger file
  • When hledger fails to read a Ledger file, the reason is clear. Ideally it detects a Ledger file and gives a custom message for this scenario, not just the usual parse errors.
  • Causes of incompatibility, and any workarounds, are collected and documented in one place.
  • hledger reads all the Ledger syntax features that correspond to our data model
  • hledger ignores all other Ledger syntax features that can be ignored
  • the syntax features it doesn't read or ignore are few in number and clearly documented
  • (Or if this complicates hledger too much, there is a separate ledger2hledger tool for it.)

Actions

  • [x] Test ledger file reading more
    • [x] gather sources of ledger files
    • [x] test manually
    • [x] set up test automation
    • [x] characterise issues
    • [x] (gather clean examples/tests)
    • [ ] (set up some easy example contribution hub or workflow, like a pastebin or chatbot or command)
  • [x] Improve https://hledger.org/ledger.html
    • [x] better presentation of journal format differences
    • [x] show support status of each feature
    • [x] list common incompatibilities and workarounds
  • [ ] Improve Ledger file parsing
    • [x] identify features which are supported, are ignored, should be supported, should be ignored, should be rejected
    • [x] clarify enhancement priorities
      • support/ignore more directives (with warnings ?)
      • improve performance on sample file collection
      • review & improve error messages
      • more lot notation support (ledger & beancount)
      • possibly revive amount expressions PR
    • [x] collect test specimens
    • [x] clean up parsers as needed
    • [x] ignore all features which should be ignored
    • [x] reject all features which should be rejected
    • [ ] support all features which should be supported
    • [ ] decide/implement local-precision balancing
    • [ ] design and implement some kind of Ledger file detection
    • [ ] implement desired UX, custom messages
    • [ ] as part of our tests, handle properly all files from: ledger tests, ledger2beancount tests, collected examples
  • [ ] Improve locale handling
    • [ ] catalogue common locale-related startup exceptions/messages
    • [ ] review/consolidate IO paths triggering failure
    • [ ] implement graceful failure, catching exceptions
    • [ ] review/improve tests
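
As a starting point for the "Ledger file detection" item, a detector could be a cheap line-based heuristic that looks for syntax hledger is unlikely to accept and reports where it was seen, so a custom message can be shown instead of a raw parse error. The sketch below is an assumption-laden illustration in Python, not a proposal for hledger's actual implementation: the hint names, the regular expressions, and the directive list are hypothetical and non-exhaustive, and have not been checked against the support status table.

```python
import re

# Hypothetical, non-exhaustive hints, chosen only for illustration.
LEDGER_HINTS = {
    # A line starting with a directive assumed here to be Ledger-specific.
    "ledger-only directive": re.compile(
        r"^(define|assert|check|eval|expr|python|bucket|capture)\b"
    ),
    # A posting whose amount field starts with "(", i.e. an amount expression.
    "amount expression": re.compile(r"^\s+\S.*?\s{2,}\("),
    # A posting containing {...}, i.e. lot notation.
    "lot notation": re.compile(r"^\s+\S.*\{.*\}"),
}

def ledger_hints(text: str) -> dict[str, list[int]]:
    """Map each hint name to the 1-based line numbers where it was seen."""
    found: dict[str, list[int]] = {}
    for lineno, line in enumerate(text.splitlines(), start=1):
        code = line.split(";", 1)[0]  # ignore trailing ";" comments
        for name, pattern in LEDGER_HINTS.items():
            if pattern.search(code):
                found.setdefault(name, []).append(lineno)
    return found

sample = """\
define rate = 2
2022-12-16 groceries
    Expenses:Food     ($10 * 2)
    Assets:Cash
2022-12-17 buy shares
    Assets:Broker     10 AAPL {$50.00}
    Assets:Cash
"""

for name, lines in ledger_hints(sample).items():
    print(f"{name}: line(s) {lines}")
```

A report like this could also serve the "estimate the (in)compatibility level" wish, since a user can paste the summary without sharing the journal itself.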

Related

  • https://github.com/simonmichael/hledger/issues/258
  • https://github.com/simonmichael/hledger/issues/428
  • https://github.com/simonmichael/hledger/issues/1021
  • https://github.com/simonmichael/hledger/issues/1084
  • https://github.com/simonmichael/hledger/issues/1752
  • https://github.com/simonmichael/hledger/issues/1964
  • https://hledger.org/ledger.html#ledger-file-format-support-status
  • http://plaintextaccounting.org/quickref/

simonmichael, Dec 16 '22 23:12

Documented the transaction-balancing precision issue for users at https://hledger.org/ledger.html#incompatible-balancing

simonmichael, Dec 18 '22 05:12

https://github.com/simonmichael/hledger/tree/master/hledger/test/ledger-compat is the start of a test suite for Ledger file compatibility. It uses Ledger's functional tests as a source of diverse sample Ledger files, and others collected manually can be added over time. Let me know if you can think of another good source.

https://gist.github.com/simonmichael/052703b1641669bfe067c68b81f707cc has the categorised results of a test run. It is easier to read in Emacs, but to summarise: we currently read about 80% of the sample data files from Ledger's tests. The most frequent causes of read failure were amount expressions and lot notation. There were ~20 other distinct causes of failure as well.
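
For anyone wanting to produce a similar tally on their own collection of files, here is a rough Python sketch. It is not the script behind the gist above; it assumes that `hledger check -f FILE` exiting with status 0 is an acceptable definition of "readable", and it groups failures by the first line of stderr, which is a crude stand-in for real categorisation. The function name `tally_reads` and the `*.dat` glob are assumptions.

```python
import subprocess
import sys
from collections import Counter
from pathlib import Path

def tally_reads(files, command):
    """Run `command + [file]` for each file.

    Returns (number of files read successfully, Counter of first stderr lines).
    """
    ok = 0
    failures = Counter()
    for path in files:
        result = subprocess.run(
            [*command, str(path)], capture_output=True, text=True
        )
        if result.returncode == 0:
            ok += 1
        else:
            lines = result.stderr.strip().splitlines()
            failures[lines[0] if lines else "unknown error"] += 1
    return ok, failures

if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (assumed layout): python tally.py path/to/sample/files
    files = sorted(Path(sys.argv[1]).rglob("*.dat"))
    ok, failures = tally_reads(files, ["hledger", "check", "-f"])
    print(f"{ok}/{len(files)} files read")
    for reason, count in failures.most_common():
        print(f"{count:4}  {reason}")
```

Because the command is a parameter, the same loop can compare hledger versions or other tools against one set of files.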

simonmichael, Dec 21 '22 19:12

https://hledger.org/ledger.html#journal-format is a new status table.

simonmichael, Dec 22 '22 01:12

design and implement some kind of Ledger file detection

What do you think of https://github.com/ledger-rs/incubator/discussions/2? The main suggestion is to declare the data format in a header at the top of the file, similar to a shebang. Handling by programs becomes easier if that declaration uses semantic versioning.
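
A header of that kind might be handled roughly as in the Python sketch below. The header syntax used here (a first-line comment such as "; format: ledger 3.3.0") is invented for illustration and is not taken from the linked discussion; `FormatHeader`, `read_header` and `can_read` are likewise hypothetical names.

```python
import re
from typing import NamedTuple, Optional

class FormatHeader(NamedTuple):
    name: str                      # e.g. "ledger"
    version: tuple[int, int, int]  # (major, minor, patch)

# Hypothetical header syntax: "; format: NAME MAJOR.MINOR.PATCH" on line 1.
HEADER = re.compile(r"^[;#]\s*format:\s*(\w+)\s+(\d+)\.(\d+)\.(\d+)\s*$")

def read_header(text: str) -> Optional[FormatHeader]:
    """Parse the format declaration from the first line, if there is one."""
    first_line = text.split("\n", 1)[0]
    match = HEADER.match(first_line)
    if match is None:
        return None
    name, major, minor, patch = match.groups()
    return FormatHeader(name, (int(major), int(minor), int(patch)))

def can_read(header: Optional[FormatHeader], name: str, major: int) -> bool:
    """Under semantic versioning, the same major version implies compatibility."""
    return header is not None and header.name == name and header.version[0] == major

header = read_header("; format: ledger 3.3.0\n2022-12-22 example\n")
print(header)
print(can_read(header, "ledger", 3))
```

With a declaration like this, a reader can reject or redirect a file before parsing any transactions, rather than failing partway through.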

alensiljak, Dec 22 '22 11:12

@alensiljak that seems a good idea. hledger also uses the file extension as a hint for input/output format: .csv/.tsv/.ssv/.timeclock/.timedot/.journal/.hledger (/.ledger/.beancount/...)
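
As an illustration of the extension-as-hint idea, here is a small Python sketch. The table and the fallback to "journal" for unrecognised extensions are assumptions made for this example and do not describe hledger's actual reader selection; the parenthesised extensions above are deliberately left out because their status is not stated here.

```python
from pathlib import Path

# Illustrative mapping from file extension to a format name (assumed, not hledger's table).
FORMAT_BY_EXTENSION = {
    ".journal": "journal",
    ".hledger": "journal",
    ".csv": "csv",
    ".tsv": "tsv",
    ".ssv": "ssv",
    ".timeclock": "timeclock",
    ".timedot": "timedot",
}

def guess_format(filename: str, default: str = "journal") -> str:
    """Guess the format from the extension, falling back to `default`."""
    return FORMAT_BY_EXTENSION.get(Path(filename).suffix.lower(), default)

print(guess_format("2022.timedot"))  # timedot
print(guess_format("bank.CSV"))      # csv
print(guess_format("main.ledger"))   # journal (fallback)
```

An extension is only a hint, so it combines naturally with a content-based check or an in-file header when the two disagree.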

simonmichael, Dec 23 '22 17:12