hledger icon indicating copy to clipboard operation
hledger copied to clipboard

boolean query expressions for more flexible queries, OR support

Open simonmichael opened this issue 11 years ago • 13 comments

From the list:

Is there a way to get hledger bal to show balances of transactions that have either TAG1 or TAG2? The docs seem to say that there isn't, but It's not clear to me.

Hi.. that's a good question. As you point out, our queries implementation means

hledger print tag:tag1 tag:tag2

only matches transactions with both tags. Most query terms take a regular expression, so I tried

hledger print tag:'(tag1|tag2')

but no, the argument of tag: is a literal tag name, not a regexp. So the surprising answer is no, it's currently not possible. Someone needs to

  1. make tag: take a regexp (matchesPosting and matchesTransaction in hledger-lib:Hledger.Query)
  2. add support for full boolean query expressions


Want to back this issue? Post a bounty on it! We accept bounties via Bountysource.

simonmichael avatar Sep 05 '14 17:09 simonmichael

I needed this, but looks like at least 1. already works. This commit seems to be it:

https://github.com/simonmichael/hledger/commit/fdc507bd1365a98a145f94a48ce7c887b1c1dc7c

ascjones avatar Oct 02 '16 20:10 ascjones

I'd just like to be able to query transactions that post both to account A and account B, which in the usual case would represent a movement of funds from A to B or B to A. This ability would help keep the account hierarchy manageable in many cases. For instance, I get paid a salary, some of which goes to a retirement account and some of which is deposited in a checking account. If I would like to see what my "take home" pay was for a given period I should be able to query like hledger balance Salary AND Checking or whatever syntax would be most error-proof. Otherwise I would resort to using Salary:Take-home, Salary:DeferredComp, Salary:Taxes and so on, even though there's a one-to-one relationship between these categories and another account.

If this were a Python project and not a Haskell project, I would add this functionality myself... Haskell seems cool though.

hoclun-rigsep avatar Mar 02 '18 15:03 hoclun-rigsep

The usual way is with a pipe, something like: `hledger print salary | hledger -f - balance'

simonmichael avatar Mar 02 '18 15:03 simonmichael

Ha! Years on the shell and this did not occur to me, I guess it didn't occur to me that the output of print is valid journal. Thanks.

hoclun-rigsep avatar Mar 02 '18 16:03 hoclun-rigsep

Yep this is quite a common pattern with hledger for more powerful filtering. Probably it should be mentioned in the manual (under QUERIES ?)

simonmichael avatar Mar 02 '18 16:03 simonmichael

The usual way is with a pipe, something like: hledger print salary | hledger -f - balance

This will not work if the hledger print salary outputs any transactions involving balance assertions, since the balance assertions will not, umm, balance.

the-solipsist avatar May 28 '19 13:05 the-solipsist

In that situation I add -I.

simonmichael avatar May 28 '19 14:05 simonmichael

Thanks for that! Didn't realize that flag existed. Perhaps the issue title could be changed to "Boolean Search Operators for More Flexible Queries"?

the-solipsist avatar May 28 '19 15:05 the-solipsist

amigalemming commented 3 days ago:

In #1246 Simon proposes "real boolean queries". This ticket may discuss the details. Shall boolean queries try to be compatible with the current query syntax? Which features shall be maintained, which ones could be dropped? Shall boolean queries replace current syntax? Shall it be the default e.g. in hledger-web?

Here an proposal: Allow space-separated lists of query terms. Equal fields are OR'd and diferent groups of equal fields are AND'd. This is done consistently for all fields, e.g. including date. These default combinations can overridden by the operators OR, AND, NOT and parentheses. In current syntax OR, AND, NOT would be interpreted as account names and parentheses are interpreted as regular expressions for account names. I propose to drop the feature that account names can specified without the acct prefix. This should resolve all ambiguities.

simonmichael avatar Aug 04 '20 16:08 simonmichael

What's needed here: I believe megaparsec used to provide special support for parsing "boolean expressions", meaning that implementing this would only require a small amount of coding in hledger. I think now perhaps it reexports something similar from Control.Monad.Combinators.Expr. We need someone to build a protoype which uses this to allow arbitrarily complex queries involving and, or, not, and parentheses for grouping, so we can test and figure out usability issues. Ledger's CLI could be used for comparative testing and ideas - I do believe it supports these complex queries, though if it's documented anywhere I can't find it, only this.

[megaparsec 7: Dropped the Text.Megaparsec.Expr module. Use Control.Monad.Combinators.Expr from parser-combinators instead.]

simonmichael avatar Aug 10 '20 15:08 simonmichael

Yes, Ledger queries are better documented in --help. Here's that for reference:

QUERIES
     The syntax for reporting queries can get somewhat complex.  It is a series of query
     terms with an implicit OR operator between them.  The following terms are accepted:

     regex          A bare string is taken as a regular expression matching the full
                    account name.  Thus, to report the current balance for all assets and
                    liabilities, you would use:

                          ledger bal asset liab

     payee regex (@regex)
                    Query on the payee, rather than the account.

     tag regex (%regex)
                    Query on tags.

     note regex (=regex)
                    Query on anything found in an item's note.

     code regex (#regex)
                    Query on the xact's optional code (which can be any string the user
                    wishes).

     term and term  Query terms are joined by an implicit OR operator.  You can change
                    this to AND by using the and keyword.  For example, to show food
                    expenditures occurring at Shakee's Pizza, you could say:

                          ledger reg food and @Shakee

     term or term   When you wish to be more explicit, use the OR operator.

     show

     not term       Reverse the logical meaning of the following term.  This can be used
                    with parentheses to great effect:

                          ledger reg food and @Shakee and not dining

     ( term )       If you wish to mix OR and AND operators, it is often helpful to sur-
                    round logical units with parentheses.  NOTE: Because of the way some
                    shells interpret parentheses, you should always escape them:

                          ledger bal \( assets or liab \) and not food

simonmichael avatar Aug 10 '20 15:08 simonmichael

If anyone is interested, here is a workaround I use.

isabekov avatar Jan 01 '22 19:01 isabekov

@isabekov lovely blog post, thanks!

I will try to work through it myself. I'm finding the "Let's say you want to display all transactions that fulfill the following conditions" section a little unclear, perhaps we could discuss in #hledger chat some time.

simonmichael avatar Jan 01 '22 19:01 simonmichael

I made an attempt at writing the code for this issue in #1989. Please let me know what you think.

I looked into Control.Monad.Combinators.Expr, but needed to have the possibility of parsing lists of queries at a time, which is perhaps possible with this, but seems more complicated to me than writing parsers by hand. The resulting parser still obeys precedence rules for operators, however.

chrislemaire avatar Jan 23 '23 20:01 chrislemaire

It's beside the point, but I forgot to mention here this workaround, complementing @isabekov's: to emulate logical AND of two terms on the same transaction part with current hledger, you can use not:not:. Eg, to print expenses:fuel purchases, but only the ones paid from the pchecking account:

hledger print expenses:fuel not:not:pchecking

simonmichael avatar Jan 24 '23 02:01 simonmichael

@chrislemaire's #1989 has landed in master, implementing this feature. Eg:

hledger reg not:expr:'desc:foo AND amt:0'`

Hurrah! More extensive testing and feedback welcome. https://hledger.org/dev/hledger.html#combining-query-terms

simonmichael avatar Mar 27 '23 21:03 simonmichael