piccolo icon indicating copy to clipboard operation
piccolo copied to clipboard

Atomic transaction?

Open powellnorma opened this issue 2 years ago • 6 comments

With code like this:

async with DB.transaction():
    await Band.insert(Band(name="test1"))
    assert False
    await Band.insert(Band(name="test2"))

Band "test1" is still inserted. Is there a way to make the transaction "really atomic"? And what does the current transaction mechanism do, e.g. what is the difference to not using a transaction (in piccolo)? Thank you!

powellnorma avatar Jul 18 '23 18:07 powellnorma

In that example, none of the queries should get committed. Is this Postgres or SQLite?

If you give this a go in the playground (piccolo playground run):

async with Band._meta.db.transaction():
    await Band.insert(Band(name='test'))
    raise Exception()

It doesn't get committed - check using this:

await Band.select()

Any exception raised within the async context manager should stop the transaction.

Things to note though:

async with Band._meta.db.transaction():
    await Band.insert(Band(name='test'))
    # This query sees the new Band, even though it's not committed:
    print(await Band.select())

# This won't see the new Band, as the transaction has been rolled back.
print(await Band.select())

Something else to bear in mind is maybe there are nested async context managers, which is causing weirdness.

async with DB.transaction():
    async with DB.transaction():
        ...

Other than that I'm not sure - maybe a strange bug somehow.

dantownsend avatar Jul 18 '23 19:07 dantownsend

Hm, when I use Band._meta.db.transaction it is indeed rolled back. But when I just use DB.transaction() it does not. I initialized DB with something like this:

DB = PostgresEngine(config={
    ...
})

I thought Band._meta.db and DB would be the same object, but I now see DB is Band._meta._db is False. What is the difference? And can you reproduce this?

powellnorma avatar Jul 18 '23 20:07 powellnorma

Yes, Band._meta.db and DB should be the same object.

>>> from piccolo_conf import DB
>>> MyTable._meta.db is DB
True

If they're not, then that explains why the transactions aren't getting rolled back.

You can check the config for each one to see what they're pointing to:

>>> from piccolo_conf import DB
>>> DB.config
{'database': 'my_app', ...}
>>> MyTable._meta.db.config
{'database': 'my_app', ...}

dantownsend avatar Jul 18 '23 20:07 dantownsend

They both point to the same config, but they are somehow still not the same object. I can reproduce with code like the following:

from piccolo.columns.column_types import Varchar
from piccolo.engine.postgres import PostgresEngine
from piccolo.table import Table
from starlette.applications import Starlette
import uvicorn

import os
os.environ['PICCOLO_CONF'] = 'app'

DB = PostgresEngine(config={
    # ..
})

class Band(Table):
    name = Varchar(unique=True)

async def do_db_stuff():
    print(Band._meta._db.config)
    print(DB.config)
    assert DB is Band._meta._db  # throws

    async with DB.transaction():
        await Band.insert(Band(name="test1"))
        assert False
        await Band.insert(Band(name="test2"))

async def on_startup():
    await do_db_stuff()
    os._exit(0)

app = Starlette(debug=True, on_startup=[on_startup])

if __name__ == '__main__':
    Band.alter().drop_table(if_exists=True).run_sync()
    Band.create_table(if_not_exists=True).run_sync()
    uvicorn.run(app, host='127.0.0.1', port=64216)

powellnorma avatar Jul 19 '23 08:07 powellnorma

I gave this a go, and you're right - in that situation DB and Band._meta.db are different objects, but they have the same configuration.

I think the problem is that Band uses engine_finder to get the engine, and since we've set the environment variable PICCOLO_CONF=app if effectively does from app import DB. Usually Python realises that this has already been imported, but I think it fails here, as what's already been imported is __main__.DB, so it imports it again.

So it's some weirdness with Python's import mechanics when using importlib.

Workaround

Move DB into piccolo_conf.py

If I move DB into a piccolo_conf.py file then it works as expected.

Bind DB directly to table

class Band(Table, db=DB):
    ...

Use __main__ as the environment variable

os.environ["PICCOLO_CONF"] = "__main__"

Fixes

In engine_finder we could check sys.argv[0] to see what the entrypoint file was. If the PICCOLO_CONF environment variable is the same, we either raise a warning, or swap it with __main__.

Here's where the import happens:

https://github.com/piccolo-orm/piccolo/blob/c8c7df571d37f7c880ccd14e2965b591db7570f2/piccolo/conf/apps.py#L374

What do you think?

dantownsend avatar Jul 19 '23 13:07 dantownsend

In engine_finder we could check sys.argv[0] to see what the entrypoint file was. If the PICCOLO_CONF environment variable is the same, we either raise a warning, or swap it with main.

If there is no ambiguity, I'd prefer swapping it instead of raising an error/warning.

Maybe it would be possible to assert that only one DB Instance (with a particular config) got initiated? That way we could catch such "Module got instantiated multiple times" Bugs. Perhaps by hashing the config, and storing it in a piccolo internal set?

Hm, maybe raising an warning when .transaction() gets called, but has no effect, is also appropriate?

powellnorma avatar Jul 20 '23 07:07 powellnorma