beanquery icon indicating copy to clipboard operation
beanquery copied to clipboard

Migration instructions

Open tarioch opened this issue 1 year ago • 12 comments

Are there any migration instructions anywhere?

I see https://github.com/beancount/beanquery/blob/master/CHANGES.rst which describes some of the changes but trying to convert my code from beancount 2 over to beanquery is right now more a trial and error with some looking through the source.

e.g. I used to do

value(position, #"2020-12-31")

and I think now I need to do

value(position, date("2020-12-31"))

As I'm embedding this into my own code, there are also changes with the returned data on the run_query, before the rows were named tuples, now I think they only contain the data and I have to get the names out of the types, correct?

tarioch avatar Jun 21 '24 11:06 tarioch

change column was dropped, I think it's now only position, right?

tarioch avatar Jun 21 '24 11:06 tarioch

The released version is 0.1.0.dev0 ie a development snapshot. Don't expect the documentation to be complete. I am planning to summarize the changes into the CHANGES.rst document and maybe write another document explicitly describing the migration path, however I haven't had time to do that yet. That said, the changes that are not backward compatible are very few, and you probably found them all already.

The #"YYYY-MM-DD" syntax for dates have been removed. You can use the more concise YYYY-MM-DD syntax. You don't need to use the explicit cast via the date() function.

The change column has been removed. AFAIK the documented name for that field has been position for a long time. I thought no one used the old name. If it is a major pain to migrate to the new name, I am open to reinstate the alias.

Yes, the return type of run_query() changed from named tuples to simple tuples. Indeed, the column names are in the columns descriptor also returned. AFAIK run_query() has never been a documented API, thus I felt free to change it in a backward incompatible way. Also in this case, if the breaking change is too much of a pain, I can reinstate backward compatibility. The change was mostly done for efficiency in returning the results, but since then beanquery itself migrated to a better API, thus putting the named tuples back into the data returned by run_query() is should not do any harm.

dnicolodi avatar Jun 21 '24 13:06 dnicolodi

I think right now I'm ok with the changes, I can adjust code on my side without too much troubles.

Some more cases which I encountered and the solutions for it (I'll add some more comments on this ticket for other cases I convert while changing my codebase, might be useful for others).

select distinct currency where currency != "CHF" and getitem(commodity_meta(currency), "type") != "invest"

didn't work anymore for undefined meta entries, but you can write it like this

select distinct currency where currency != "CHF" and getitem(commodity_meta(currency), "type", "other") != "invest"

similar

instead of

        select
            meta('project') as name,
            sum(number) as number
        where
            meta('project') != null
        group by meta('project')
        order by meta('project')

you need to write

        select
            meta('project') as name,
            sum(number) as number
        where
            str(meta('project')) != ""
        group by meta('project')
        order by meta('project')

tarioch avatar Jun 21 '24 15:06 tarioch

Created #192 for an issue where I didn't find a way to rewrite the query (location with pad directive)

tarioch avatar Jun 21 '24 15:06 tarioch

select distinct currency where currency != "CHF" and getitem(commodity_meta(currency), "type") != "invest"

didn't work anymore for undefined meta entries, but you can write it like this

select distinct currency where currency != "CHF" and getitem(commodity_meta(currency), "type", "other") != "invest

I don't understand how the first version of this query fails. Maybe I don't get what you mean with "undefined meta entries". If the commodity has no metadata associated, commodity_meta(currency) return NULL and getitemt(NULL, "type") is still NULL, which is different from invest, thus the condition is false. I don't understand what else you expect.

dnicolodi avatar Jun 21 '24 16:06 dnicolodi

meta('project') != null

This is spelled meta('project') IS NOT NULL now.

The fact that the != raises an error is a bug.

dnicolodi avatar Jun 21 '24 16:06 dnicolodi

select distinct currency where currency != "CHF" and getitem(commodity_meta(currency), "type") != "invest"

didn't work anymore for undefined meta entries, but you can write it like this

select distinct currency where currency != "CHF" and getitem(commodity_meta(currency), "type", "other") != "invest

I don't understand how the first version of this query fails. Maybe I don't get what you mean with "undefined meta entries". If the commodity has no metadata associated, commodity_meta(currency) return NULL and getitemt(NULL, "type") is still NULL, which is different from invest, thus the condition is false. I don't understand what else you expect.

That is what I was expecting. But those are actually missing, so it seems null != "invest" is somehow evaluated to true.

tarioch avatar Jun 21 '24 16:06 tarioch

It works as expected here:

plugin "beancount.plugins.auto_accounts"

2024-01-01 commodity AAA
  type: "baz"

2024-01-01 commodity BBB
  type: "invest"

2024-01-01 commodity CCC

2024-06-21 * "Test"
  Assets:Foo                                               1 AAA
  Assets:Foo                                               2 BBB
  Assets:Foo                                               3 CCC
  Equity:Opening-Balances

9999-01-01 query "test" "
  SELECT
  DISTINCT
    currency
  WHERE
    currency != 'CHF' AND
    getitem(commodity_meta(currency), 'type') != 'invest'
  "
beanquery> .run test
currency
────────
AAA

However, I think that what you want is NULL != 'invest' to return true, not false. This is a very common SQL pitfall: all comparisons with NULL return NULL, which is false. If you want to get all the commodities for which the invest metadata field is not "invest" or is not set, you need to write it as NOT commodity_meta(currency)['type'] = 'invest':

beanquery> SELECT DISTINCT currency WHERE currency != 'CHF' AND NOT commodity_meta(currency)['type'] = 'invest'
currency
────────
AAA
CCC

dnicolodi avatar Jun 21 '24 17:06 dnicolodi

Yes that's exactly the difference in behavior to what it was before. I think the new behavior is more in line with other SQL cases and therefore makes sense.

tarioch avatar Jun 21 '24 17:06 tarioch

Yes, handling of NULL values accordingly to the SQL standard is an intentional change in behavior.

dnicolodi avatar Jun 21 '24 17:06 dnicolodi

I had a couple similar cases, basically before non existing meta was considered as "" and now it's null. I think null actually makes more sense, so the new behavior is good and cleaner, just something to be aware of when migrating.

I think I'm now through with the migration of the codebase and it was actually quite simple, to summarize the points I had to do

  • missing meta now results in null instead of empty string
  • #"YYYY-MM-DD" is now written as YYYY-MM-DD
  • rows returned by run_query are no longer named tuples
  • don't use (deprecated) change column, use position instead

I think all the changes make sense and are leading to cleaner code.

tarioch avatar Jun 21 '24 17:06 tarioch

Thanks for your comments and for being an early adopter!

  • #"YYYY-MM-DD" is now written as YYYY-MM-DD

The new syntax is supported by bean-query in Beancount v2 too. The change is the removal of the # prefixed syntax.

dnicolodi avatar Jun 21 '24 18:06 dnicolodi