hledger
date parsing is not aware of non-English month names
Reported by pablo1107 in #plaintextaccounting: a date-format in CSV rules did not perform as expected:
~/ledger master*
❯ export LC_TIME="es_AR.UTF-8"
~/ledger master*
❯ hledger print --rules-file vacaciones-tc.rules -f vacaciones-tc.csv
hledger: error: could not parse "30 abr 2022" as a date using date format "%d %h %Y"
record values: "30 abr 2022","MERCADOPAGO*QUILMESROCK 02/06 ","ARS +383,33"
the date rule is: %1
the date-format is: %d %h %Y
you may need to change your date rule, change your date-format rule, or change your skip rule
for m/d/y or d/m/y dates, use date-format %-m/%-d/%Y or date-format %-d/%-m/%Y
In fact hledger has no awareness of the system time locale / $LC_TIME; it hard-codes the en_US time locale (as do essentially all Haskell programs, I would guess).
This makes it difficult to parse CSV containing non-English month or day names. It may also manifest in other ways, though I don't know of any.
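To illustrate the effect, here is a minimal sketch using only the time package. It is not hledger's actual code: `spanishLocale` and `parseWith` are hypothetical names, and the locale is simply the en_US default with the month names swapped for Spanish ones. It uses `%b` for the abbreviated month name.

```haskell
import Data.Time

-- Hypothetical Spanish locale: the en_US default with only the month
-- names replaced, as (full name, abbreviation) pairs. This is an
-- illustration, not something provided by hledger or the time package.
spanishLocale :: TimeLocale
spanishLocale = defaultTimeLocale
  { months =
      [ ("enero", "ene"), ("febrero", "feb"), ("marzo", "mar")
      , ("abril", "abr"), ("mayo", "may"), ("junio", "jun")
      , ("julio", "jul"), ("agosto", "ago"), ("septiembre", "sep")
      , ("octubre", "oct"), ("noviembre", "nov"), ("diciembre", "dic")
      ]
  }

-- Parse a "day month-abbreviation year" date under the given locale.
parseWith :: TimeLocale -> String -> Maybe Day
parseWith locale = parseTimeM True locale "%d %b %Y"

main :: IO ()
main = do
  -- The default locale only knows English month names, so this fails.
  print (parseWith defaultTimeLocale "30 abr 2022")
  -- With Spanish month names supplied, the same input parses.
  print (parseWith spanishLocale "30 abr 2022")
```

Run as written, the first line printed should be `Nothing` and the second `Just 2022-04-30`, which matches the failure in the report: the format string is fine, but the locale passed to the parser does not contain "abr".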
The env-locale package (https://hackage.haskell.org/package/env-locale) seems to be the way to get the system time locale. E.g.:
#!/usr/bin/env stack
-- stack runghc --verbosity info --package time --package env-locale
import Data.Time
import Data.Time.Format
import System.Locale.Current
main = do
  ctl <- currentLocale
  d <- parseTimeM False ctl "%b" "abr" :: IO Day
  print d
$ export LC_TIME=es_AR.UTF-8
$ ./a.hs
1970-04-01
This is a bit of a can of worms though, similar to the text encoding discussion at #1834 etc. The main question, if this capability is desirable (and it seems so): what is the best default, one that balances predictability, convenience for reading your local data, and convenience for reading foreign-language data? Probably we can follow whatever approach we decide on for text encoding.