hledger
[WIP] tab-separated text as --output-format
I'd like to add a new output format, especially for the register command: tab-separated text.
Use-case:
I often use hledger to generate register reports which must be formatted nicely. For that, I use text processing software (e.g. LibreOffice). To get a nice format, I have to adapt the font size and the output width (-w) to match the page width and avoid line breaks. Also, only monospace fonts are possible with the text format. To remove the account column, I have to apply an ugly sed script.
Suggestion: I suggest a tab-separated text format that puts tab characters between date, description, account, amount, and total/average as column separators. The output could also be pasted directly into spreadsheet software.
Alternatives considered: the CSV output format would be an alternative for my use-case, but it's much easier to do
hledger register -Otxt-tab-separated | xclip -i
… and then paste it directly into a LibreOffice Writer template.
Would maintainers accept such a PR?
We support reading csv, tsv and ssv, so I see no harm in adding (well-formed, standard-compliant) tsv as another output format. I would use that tsv output format name for consistency.
Are we ok with it being available only on a few commands? Supporting it consistently on all commands is a big effort.
That said, I bet there are utilities to easily convert csv to tsv.
tsv sounds like csv with tabs, and for that I would indeed use a converter tool.
My desired output format is actually just the output of plain hledger register, but with tabs instead of space-padding. I would not add any extra data columns as in csv. Therefore I'm not sure whether tsv is an expressive name for that.
I'm also fine with adding that output format to only a few commands. It makes sense especially for register, activity, and maybe balance and others.
Here is a csv to tsv converter:
$ cat csv2tsv
#!/usr/bin/env python
import csv, sys
csv.writer(sys.stdout, dialect='excel-tab').writerows(csv.reader(sys.stdin))
I'm a bit hesitant to add a whole new formatting option (with accompanying alignment annoyances) for such a simple change. Can you take a look at the output of the following and tell me if it meets your needs?
$ hledger register --output-format=csv | csv2tsv
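As a rough illustration of the "CSV plus converter" route, the sketch below also drops unwanted columns (here the account column) while converting. The column names and the sample row are assumptions for illustration; check them against the header row of your own CSV output.

```python
import csv
import io

# Assumed column names; verify against the header row of your own
# `hledger register --output-format=csv` output.
WANTED = ["date", "description", "amount", "total"]

def csv_to_tsv(csv_text, wanted=WANTED):
    """Keep only the wanted columns and return tab-separated text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.writer(out, dialect="excel-tab", lineterminator="\n")
    for row in reader:
        writer.writerow([row[name] for name in wanted])
    return out.getvalue()

# Made-up sample data, not real hledger output.
sample = (
    '"txnidx","date","code","description","account","amount","total"\n'
    '"1","2021-11-08","","rent","expenses:rent","500","500"\n'
)
print(csv_to_tsv(sample), end="")
```

Because the columns are selected by name, the same function could be reused for other reports by passing a different `wanted` list.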
Another alternative is just to use sed, though I'm still not sure what the parameters you want for alignment are.
$ hledger register | sed -e "s/   */\t/g"
Thanks for the snippets, @Xitian9. For me, converting csv to tsv is not the problem.
For me it would be very useful to have normal register output but with tabs instead of space-padding (see top post). I thought it might be useful for other people making printed/published reports with hledger; that's why I made this PR.
Edit: Sorry, I didn't read carefully enough. Your second snippet actually works quite well for my sample register output.
At least as long as:
- there are not more than 3 spaces in the description and
- the 3 spaces between the last two columns are guaranteed.
- Also, I don't know how the spaces between the columns change for longer descriptions and accounts.
- Also, a tab character between the first two columns would be nice.
I'm not sure about these preconditions; that's why I think a built-in solution might be better.
Here is a screenshot demonstrating the use-case:
The sed script only requires two spaces between each field, and that is guaranteed between all fields except between the date and the description. You are correct that this will choke when there are two or more consecutive spaces in a description.
Try this:
hledger register | sed -e "s/ /\t/" -e "s/   */\t/g"
I believe this addresses points 2 and 4. Point 1 remains an issue, though I'm not sure how often it would arise in practice. Point 3 is a general issue with tabs, and one of the reasons they are best avoided for actual text alignment: every piece of software handles them differently, and what works for one person will completely break another's workflow.
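To make the remaining failure mode concrete, here is a small Python sketch of the same two substitutions (first space, then every run of two or more spaces). The two input lines are made up for illustration and are not real hledger output.

```python
import re

def spaces_to_tabs(line):
    """Mimic the two sed expressions on a single register line."""
    line = line.replace(" ", "\t", 1)     # date/description boundary
    return re.sub(r" {2,}", "\t", line)   # runs of 2+ spaces between columns

# Made-up lines; the second has two consecutive spaces in the description.
good = "2021-11-08 rent payment      expenses:rent      500      500"
bad = "2021-11-08 rent  payment     expenses:rent      500      500"

print(spaces_to_tabs(good).split("\t"))
print(spaces_to_tabs(bad).split("\t"))
```

The first line splits into five fields, while the second splits into six because the description is cut in two.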
By the way, have you tried hledger-web? It sounds like it may fill the need of what you're trying to do.
> By the way, have you tried hledger-web? It sounds like it may fill the need of what you're trying to do.
Thanks, but it doesn't provide the power of the command-line hledger:
- I cannot easily generate custom reports by script/Makefile. I would have to copy paste search queries and results by hand, right?
- I cannot use options like --invert, which I need for some reports.
- When copying from /register, it yields this, with the total on the next line:
2021-11-08 fcdc6aec ce:e0:3b7c5f45, 83:e0:3b7c5f45 0
0
2021-11-06 c140d73e fa:53f9679b, ce:9ec9dcff 0
0
Thanks, I can probably use your second sed script when I have fixed occurrences of two consecutive spaces in the bank transaction titles.
> I believe this addresses points 2 and 4. Point 1 remains an issue, though I'm not sure how often it would arise in practice. Point 3 is a general issue with tabs, and one of the reasons they are best avoided for actual text alignment: every piece of software handles them differently, and what works for one person will completely break another's workflow.
Aren't Points 1 and 3 a reason for implementing this right in hledger? I mean, all (GUI) text processing software knows the concept of tab stops. And text processing software is a good tool to use when making final reports. I have used it for a few years now, for a 20-page report for a classical music festival.
Addressing Point 3 specifically, my idea was to reinterpret the -w n,m option for -O tsv as description-max-width = n and account-max-width = m.
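A minimal sketch of what that reinterpretation could mean: each field is cut to a maximum width before the row is joined with tabs. The function names, the plain truncation rule, and the sample values are all hypothetical; nothing here reflects how hledger itself shortens fields.

```python
def shorten(text, max_width):
    """Cut text to at most max_width characters; None means no limit."""
    if max_width is None or len(text) <= max_width:
        return text
    return text[:max_width]

def tsv_row(date, description, account, amount, total,
            desc_width=None, acct_width=None):
    """Build one tab-separated row with optional width limits."""
    fields = [
        date,
        shorten(description, desc_width),  # n in the proposed "-w n,m"
        shorten(account, acct_width),      # m in the proposed "-w n,m"
        amount,
        total,
    ]
    return "\t".join(fields)

# Made-up values for illustration.
print(tsv_row("2021-11-08", "monthly rent payment", "expenses:housing:rent",
              "500", "500", desc_width=12, acct_width=8))
```

With tab stops doing the alignment in the word processor, the widths only need to keep long fields from overflowing a column, not pad short ones.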
I agree with you overall, but I just can't shake the feeling that this is not quite the right way to go about it.
The way I see it, we have the following output types:
- plain text: for quick-and-dirty human-readable output when you don't care about fancy formatting
- csv: as a data exchange format with other tools
- json: for serialising
- sql: for sql
My understanding is that you want tab-separated report output so that you can copy-paste it into libreoffice documents, and then format it nicely for print reports. This seems more like a workaround for a lack of a proper templating engine than an actual use case in itself. I just feel that there's a better solution lurking around here somewhere, and focussing on finding that will give better results than adding a new output format.
There might be something useful in here: https://unix.stackexchange.com/questions/170199/is-there-a-standalone-tool-which-will-write-reports-from-csv-data-files
I'm a little unsure as well, whether the proposed format is generally useful enough and generalizable enough to be worth adding as a baked in format. It might be, I just don't have enough experience with your use case to say yet..
@Xitian9 Regarding other tooling / template software: I already considered using org-mode with babel code blocks and plain hledger output, and generating PDF reports via org-mode LaTeX export. But I find report formatting very cumbersome in LaTeX.
My impression is that a specialized report generation software for hledger is actually missing.
Thanks for the link, but that tool only converts one CSV file to one ODF. My report contains many different hledger reports.
I understand that both of you are hesitating. I just want to summarize the current options for making reports:
- plain text: for quick-and-dirty human-readable output when you don't care about fancy formatting => not sufficient for proper reports. There are work-arounds (e.g. sed) that fail in some cases
- csv: not very useful for making plain paper reports. Too many steps to get to reports: selecting columns, changing the separator. Too much programming/scripting or manual steps for amateurs.
- json: not at all useful for amateurs making reports
- sql: not at all useful for amateurs making reports
- hledger-web: limited: cannot do everything that hledger can do; cannot copy proper tab-separated register output to clipboard (see above)
So, how do people make real reports with hledger?
"Freaks" like us may do with org-mode, LaTeX, plain-text…
But shouldn't hledger also aim at users who make reports with normal text processing software like MS Word and LibreOffice? And then, how should they make reports? So far, I have no good way, only work-arounds like replacing spaces with tabs using custom sed scripts, or replacing 20 spaces with nothing to shorten the line, and fiddling around with the -w n,m option. I think a tab-separated text output could fill the gap perfectly.
But please let me know, how do you produce actual printable reports?
Great question, which I agree we want better answers for. It probably deserves its own issue, mail list / chat room discussion, and web page (maybe this one). I'll just add pandoc to that list, here's a related example (in essence: a script collects hledger data values, plugs them into a markdown template with envsubst, and pandoc renders pretty documents).
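The template idea can be sketched in a few lines of Python, with string.Template standing in for envsubst. The template text, placeholder names, and values below are invented for illustration; in practice the values would be collected from hledger commands and the rendered markdown handed to pandoc.

```python
from string import Template

# Hypothetical markdown template with $-style placeholders.
TEMPLATE = Template(
    "# Report for $period\n"
    "\n"
    "Total income: $income\n"
    "Total expenses: $expenses\n"
)

def render(values):
    """Fill the template; raises KeyError if a placeholder has no value."""
    return TEMPLATE.substitute(values)

# Placeholder values; a real script would gather these from hledger.
values = {"period": "2021", "income": "1000 EUR", "expenses": "800 EUR"}
print(render(values))
```

Using substitute() rather than safe_substitute() means a missing value fails loudly instead of leaving a stray placeholder in the printed report.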
PS don't forget html output.
> PS don't forget html output.
(but not for register and balance)
Wow, you're right, we really should support these output formats more consistently.
Yes, maybe pasting an HTML table into LibreOffice might be an alternative to the current solution.
I've written an output filter script:
hledger-output-filter - Transform hledger's register and print output.
usage: hledger-output-filter [options]
options:
-t tab-separated instead of space-separated; useful for use in text
processing software with tabstops.
requirement: descriptions must not contain two consecutive whitespace!
-c omit all comments
-d DESCRIPTION_WIDTH
shorten description to n characters.
requires -t.
-a ACCOUNT_WIDTH
shorten account name to n characters.
requires -t.
-s a single date, i.e. "2021-11-19" instead of "2021-11-19=2021-11-20".
-h print help message.
- https://github.com/schoettl/hledger-contrib/blob/master/hledger-output-filter.sh
I find it useful, but it's a hack. I had to fix dozens of double spaces in my descriptions.
Nice! I often do the same to prototype a feature. On the upside, you cleaned up your data a bit..
The next level of robustness would be a hledger script or two, which can use hledger's parsers, and mimic the code in Register.hs/Print.hs. (hledger-register-pretty-tsv.hs, hledger-print-pretty-tsv or some such..)
(I'll try out your script. PS, nice https://github.com/schoettl/hledger-contrib repo, we should link it somewhere.)
That bash script looks very nice. I have been writing many recently in almost exactly that style, but I see some new things to learn from yours.
I'm not seeing much effect though. I don't notice any change to register or print output, eg from hledger -f examples/sample.journal print | hledger-output-filter.sh. Adding -t gives
line 50: declare: -g: invalid option
declare: usage: declare [-afFirtx] [-p] [name[=value] ...]
because I'm on a mac, probably.
PPS:
> -s a single date, i.e. "2021-11-19" instead of "2021-11-19=2021-11-20".
Yet another case of secondary dates getting in the way. I'm starting to really detest this feature!
> I'm not seeing much effect though. I don't notice any change to register or print output, eg from hledger -f examples/sample.journal print | hledger-output-filter.sh
Right, without options, the filter is id and doesn't change anything.
> line 50: declare: -g: invalid option declare: usage: declare [-afFirtx] [-p] [name[=value] ...]
declare -gr var=val would be a global read-only variable, but Mac has such an old version of Bash that this doesn't work. Maybe global instead of declare would work on Mac.
> Yet another case of secondary dates getting in the way. I'm starting to really detest this feature!
Yes, I guess the secondary date is mostly not needed in register output. But in principle, it's valuable information.
> The next level of robustness would be a hledger script or two, which can use hledger's parsers, and mimic the code in Register.hs/Print.hs.
But hledger doesn't have a parser for register output, does it?
> But hledger doesn't have a parser for register output, does it?
No, I meant the script can use hledger-lib to parse files just as the builtin commands do.
A useful discussion, that did not result in a PR; closing.