support tsv: and ssv: file path prefixes, as well as csv:

Open simonmichael opened this issue 1 year ago • 1 comment

As reported in chat, -f tsv:- and -f ssv:- don't work like -f csv:-. That seems natural to expect, so we'd like to support it.

Currently a file path prefix in hledger must be a canonical format name (internally, a reader name), and the single reader that handles CSV, SSV and TSV is named csv.

[hledger internally conflates "reader" and "data format", and thinks of csv/ssv/tsv as one format with just a different separator setting, but humans think of csv/ssv/tsv as three different formats.]
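To make the current behaviour concrete, here is a simplified, standalone model of the prefix splitting. The `readerNames` list is an illustrative subset that I am assuming for the example, not the real list derived from hledger's readers, and the function is a rewrite of the idea using only base, not the actual hledger code.

```haskell
import Data.List (isPrefixOf)

-- Illustrative subset of canonical reader names (an assumption for this sketch;
-- in hledger the list is derived from the registered readers).
readerNames :: [String]
readerNames = ["journal", "timeclock", "timedot", "csv"]

-- Split off a "READER:" prefix, but only if READER is a canonical reader name.
-- Anything else is treated as part of the file path.
splitReaderPrefix :: FilePath -> (Maybe String, FilePath)
splitReaderPrefix f =
  case [ (Just r, drop (length r + 1) f)
       | r <- readerNames, (r ++ ":") `isPrefixOf` f ] of
    (x:_) -> x
    []    -> (Nothing, f)
```

In this model `splitReaderPrefix "csv:-"` gives `(Just "csv", "-")`, while `splitReaderPrefix "tsv:-"` gives `(Nothing, "tsv:-")`: the prefix is not recognised, so the whole string is taken as a file path.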

Here is a patch that partially adds support for ssv: and tsv: prefixes:

--- a/hledger-lib/Hledger/Read/JournalReader.hs
+++ b/hledger-lib/Hledger/Read/JournalReader.hs
@@ -151,2 +151,4 @@ readerNames = map rFormat (readers'::[Reader IO])
 -- Find the reader named by @mformat@, if provided.
+-- ("ssv" and "tsv" are recognised as alternate names for the csv reader,
+-- which also handles those formats.)
 -- Or, if a file path is provided, find the first reader that handles
@@ -155,3 +157,3 @@ findReader :: MonadIO m => Maybe StorageFormat -> Maybe FilePath -> Maybe (Reade
 findReader Nothing Nothing     = Nothing
-findReader (Just fmt) _        = headMay [r | r <- readers', rFormat r == fmt]
+findReader (Just fmt) _        = headMay [r | r <- readers', let rname = rFormat r, rname == fmt || (rname=="csv" && fmt `elem` ["ssv","tsv"])]
 findReader Nothing (Just path) =
@@ -170,2 +172,7 @@ type PrefixedFilePath = FilePath
 -- split that off. Eg "csv:-" -> (Just "csv", "-").
+-- These reader prefixes can be used to force a specific reader,
+-- overriding the file extension. They are the readers' canonical names,
+-- not all the file extensions they recognise. But as a special case,
+-- "csv", "ssv" and "tsv" are all recognised as prefixes selecting the csv reader,
+-- since it handles all three of those formats.
 splitReaderPrefix :: PrefixedFilePath -> (Maybe String, FilePath)
@@ -173,3 +180,3 @@ splitReaderPrefix f =
   headDef (Nothing, f) $
-  [(Just r, drop (length r + 1) f) | r <- readerNames, (r++":") `isPrefixOf` f]
+  [(Just r, drop (length r + 1) f) | r <- readerNames ++ ["ssv","tsv"], (r++":") `isPrefixOf` f]

But this only selects the csv reader; it also needs to somehow instruct that reader to expect SSV or TSV, i.e. adjust the separator. See https://hledger.org/1.32/hledger.html#separator , which also seems to imply that ssv: and tsv: already work (wrongly, I think).
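One possible shape for the missing piece, as a rough standalone sketch rather than a proposed patch: have the prefix splitter return an implied separator alongside the canonical reader name, so the caller can pass it on to the csv reader. The names `prefixTable` and `splitReaderPrefixSep` are hypothetical, the table lists only a subset of readers, and how the separator would actually reach the reader (and how it should interact with a `separator` rule) is left open.

```haskell
import Data.List (isPrefixOf)

-- Hypothetical table: (accepted prefix, canonical reader name, implied separator).
-- Nothing means the prefix implies no particular separator.
prefixTable :: [(String, String, Maybe Char)]
prefixTable =
  [ ("journal", "journal", Nothing)
  , ("csv",     "csv",     Just ',')
  , ("ssv",     "csv",     Just ';')
  , ("tsv",     "csv",     Just '\t')
  ]

-- Like splitReaderPrefix, but maps "ssv"/"tsv" to the csv reader and also
-- reports the separator that the prefix implies.
splitReaderPrefixSep :: FilePath -> (Maybe String, Maybe Char, FilePath)
splitReaderPrefixSep f =
  case [ (Just reader, sep, drop (length p + 1) f)
       | (p, reader, sep) <- prefixTable, (p ++ ":") `isPrefixOf` f ] of
    (x:_) -> x
    []    -> (Nothing, Nothing, f)
```

With this, `splitReaderPrefixSep "tsv:-"` evaluates to `(Just "csv", Just '\t', "-")`, so the reader lookup can stay keyed on canonical names while the separator travels separately.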

Help welcome!

simonmichael, Feb 05 '24 23:02

I mentioned this over in Matrix too, but just so it's here, I'm going to try diving into this and aim to have something put together within the next few days!

reesmichael1, Feb 07 '24 15:02