ibis icon indicating copy to clipboard operation
ibis copied to clipboard

feat(caching): tie lifetime of cached tables to python refs

Open coroa opened this issue 1 year ago • 2 comments

Description of changes

Makes use of weakref.finalizers to release the cached table automatically when the cached table (or more precisely its .op()) is garbage collected.

In [1]: import pandas as pd

In [2]: import ibis

In [3]: con = ibis.duckdb.connect()

In [4]: df = pd.DataFrame([[0, 1], [2, 3]], columns=["foo", "bar"])

In [5]: noncached = con.create_table("tab", df).mutate(foo=2 * tab.foo)

In [6]: cached = noncached.cache()

In [7]: con.tables
Out[7]: 
Tables
------
- ibis_cache_wvrjoptclzhudglqapz4zusnlu
- tab

In [8]: del cached

In [9]: con.tables
Out[9]: 
Tables
------
- tab

If I understand the nodal expression tree structure correctly, then cached.op() is preserved in all dependent expressions. This PR adds a weakref.finalize(cached.op(), ...) (docs) onto cached.op() which releases the physical cached copy from the databases.

The idempotency of .cache() is also preserved by returning the same table reference.

coroa avatar Jun 29 '24 19:06 coroa

ACTION NEEDED

Ibis follows the Conventional Commits specification for release automation.

The PR title and description are used as the merge commit message.

Please update your PR title and description to match the specification.

github-actions[bot] avatar Jun 29 '24 19:06 github-actions[bot]

~The bidi dependency (for a bidirectional dictionary) is not necessary anymore with the RefCountedCache implementation in this PR, but quite obviously my attempt at removing it was incomplete.~

coroa avatar Jun 29 '24 20:06 coroa

I will clean up the ruff lints in #9489!

cpcloud avatar Jul 01 '24 12:07 cpcloud

I will clean up the ruff lints in #9489!

Once that gets merged, i'll rebase ontop

coroa avatar Jul 01 '24 14:07 coroa

@coroa #9489 is merged!

cpcloud avatar Jul 01 '24 14:07 cpcloud

Ok, I rebased ontop of main and removed the DeprecationWarning and updated the .cache() doc-string with another sentence on the explicit and implicit cache eviction.

Only things left on my list of todos are:

  • [x] Release notes
  • [x] Check and update user guide

coroa avatar Jul 01 '24 14:07 coroa

Looks like there's maybe a bit of DeprecationWarning still hanging around?

cpcloud avatar Jul 01 '24 14:07 cpcloud

A release note will be generated from the commit message, so no need to do anything there.

cpcloud avatar Jul 01 '24 14:07 cpcloud

Looks like there's maybe a bit of DeprecationWarning still hanging around?

Ah yes, i forgot about all the tests i had to change :).

coroa avatar Jul 01 '24 14:07 coroa

Couldn't find any user-guide explanations of the cache'ing mechanism. So i guess the doc-string is the best we have atm. And that is up-to-date.

So, from my side this is good to go.

coroa avatar Jul 01 '24 17:07 coroa

Unsure what tripped the mssql backend. Unfortunately, have not been able to run those tests locally.

I think I'll need help to address the test failure.

coroa avatar Jul 02 '24 09:07 coroa

One transient error and one XPASS, which is a win!

cpcloud avatar Jul 02 '24 09:07 cpcloud

I'll fix this up and merge!

cpcloud avatar Jul 02 '24 09:07 cpcloud

Clouds are passing:

…/ibis on  self-releasing-cached-tables is 📦 v9.1.0 via 🐍 v3.10.14 via ❄️   impure (ibis-3.10.14-env)
❯ pytest -m 'bigquery or snowflake' -n 8 --dist loadgroup --snapshot-update -q
bringing up nodes...
.x.x...........................x....s................x..........s...........................x.................xx..........x..x............x..........................x...x........................... [  5%]
............x....................x....................s..................s................................................s......................................................s..x.x.......x...... [ 10%]
x.......x.x..x.........x....x..x.xx....x..........x....................x....................................................x...............................................sssssssssssssssssssss.... [ 16%]
xxxxxxxxxx.x.xx.x.x....x.x.....................................x.............x........x......xx.x.........................x.......x....x.x...........x...x.......................x................... [ 21%]
...x....x..........................x.x....................x........x.......x.x.......x..x......................x..........................xx...............x......x.x..x...x...x......x.............. [ 26%]
.....x.............................x....xx.......x......x..xxxx.xx.x.x.xxx..xx.x...xxxx...x..xx.xxx..x.x..xx...xx.x.x...x......xxx.xxxx.xxx.x...xx.xx.xxxxxx.xx...x.xx.......x.........x.....x....... [ 32%]
xx..........x............................x...x..x..................x....x.....x............x.....x................x..x.....xxx..x.x...xxx.xxxx.xx......xx.x...x..x.....x..............x.............. [ 37%]
..x.x.x.x.............x..x.....x..x.x.......x.x............................x....x..x.x..x...x..x.....x...x..............x...xxxxx....x..................x.........x.......xx..x.x.................... [ 42%]
.....s.............x..............s.x...x....x..............sx..xx.s.x...x..x.......s......x.x......x..........x...................x...........x...........................x..................xx..... [ 48%]
.x.........x.....................x...........................s..........................s....................s..........x................................x..............x.x...x..x...........x...x... [ 53%]
..............xxx.....xx......................x.xx..x......x.x..x...x.x......x............x.x..x.......x..................x.......xx..xx..xx...xxxxxxxxxx...xxx...xxxx.xxxx..x...x.x..x..x.xxx.xx..x. [ 58%]
xxxxx..x.x.x...x..................x..x...x...x....x........x........x.....x.x....x.............x.x.x...xx.....x....x...x.......x...xxxxx................x................x..xx....xx................. [ 64%]
....x...x....x................x.............x.x.....x........x.x.x....................x..........x..xx.....x.......x.x...........x..xx.......x..x................x.....x......xxx...x.x.xxx..xxxxxxxx [ 69%]
xxxx.xxxxx.xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx..x......................x.........................x.....................x...x........ [ 74%]
....................................................................x.xxxxxx.xx.x....xx.xx....x..x.......x....................x..............x........x.x.xx...x.x...x.....x....x...xx..............x [ 80%]
..xx....x...xx..x..x.....x.....................x..x.................x.x.....x.......x...........x........x....x..x.x.......x.......x................................................................. [ 85%]
..........................................................................................................................................................................s.......................... [ 90%]
.........x.............................................................................................................................................................................x.x........... [ 96%]
......................................................s...ss......................................x...........................................                                                        [100%]
3097 passed, 39 skipped, 552 xfailed in 840.67s (0:14:00)

cpcloud avatar Jul 02 '24 10:07 cpcloud

@coroa Thanks for pushing this through, great work!

cpcloud avatar Jul 02 '24 10:07 cpcloud

Thanks for the quick reviews and comments!

coroa avatar Jul 02 '24 11:07 coroa