Pool based async persister
This PR adds a way to instantiate the async postgres persister with a connection pool. The goal is to prevent a bug in parallelism: when the user implements async versions of map-reduce classes (i.e. set is_async to True), burr calls asyncio.gather instead of threads, meaning we can't reuse the connection of the async persister.
One solution is to update b_asyncpg to use a connection pool instead of a single connection and add a copy method to it, meaning that when burr cascades to get persisters for each sub-app, we get a new connection from the pool each time.
Changes
- Added a way to instantiate
AsyncPostgresPersisterwith a pool instead of a connection - Added a copy method that favors using the pool in copied instances if self.pool is not None
- Kept everything backwards compatible
- Also added missing async related docs
How I tested this
Manual testing
Notes
So far I only implemented a proposed solution in b_asyncpg, but the issue is still present in other async persister implementations. Once we agree on a definitive solution, I can update them as well.
Open question:
- Do we really want to keep things backwards compatible and let user instantiate with a simple connection? In most scenarios a pool is safer, especially in async context
Checklist
- [ ] PR has an informative and human-readable title (this will be pulled into the release notes)
- [ ] Changes are limited to a single goal (no scope creep)
- [ ] Code passed the pre-commit check & code is left cleaner/nicer than when first encountered.
- [ ] Any change in functionality is tested
- [ ] New functions are documented (with a description, list of inputs, and expected output)
- [ ] Placeholder code is flagged / future TODOs are captured in comments
- [ ] Project documentation has been updated if adding/changing functionality.
[!IMPORTANT] Add connection pool support to
AsyncPostgreSQLPersisterfor better parallelism handling in async contexts.
- Behavior:
AsyncPostgreSQLPersistercan now be instantiated with a connection pool usingfrom_values()withuse_pool=True.- Adds
copy()method toAsyncPostgreSQLPersisterto create new instances using the pool.- Updates
_get_connection()and_release_connection()to handle pooled connections.- Documentation:
- Updates
actions.rst,parallelism.rst, andsync-vs-async.rstto include information about using connection pools with async persisters.- Misc:
- Maintains backward compatibility with direct connections.
- Fixes minor formatting issues in
b_asyncpg.py.This description was created by
for c4d6293bb3c9dadbd9ea387fab6e5b1687167aaf. You can customize this summary. It will automatically update as commits are pushed.
I pushed fixes for the pre-commit hooks. I'd still like to know your opinion on the following @elijahbenizzy : I have kept things backwards compatible, but should we really let people instantiate async persisters with direct connections? This means people that use this will still run into the issue when using parallelism. In async setting, I would argue it'd be better to "force" connection pools to be used
Regarding tests, there are already some in the test folder that I can draw inspiration from, but if you have something specific in mind lmk
I pushed fixes for the pre-commit hooks. I'd still like to know your opinion on the following @elijahbenizzy : I have kept things backwards compatible, but should we really let people instantiate async persisters with direct connections? This means people that use this will still run into the issue when using parallelism. In async setting, I would argue it'd be better to "force" connection pools to be used
Regarding tests, there are already some in the test folder that I can draw inspiration from, but if you have something specific in mind lmk
@gamarin2 Ok, I'm not sure this is "backwards compatible" as much as "preserves prior behavior". E.G. will anyone's code break if this is released and the default is set to True rather than False for use_pool?
I don't think so, although you have a better sense.
If not: let's set the default to the optimal behavior and save people a headache.
If so: let's think more carefully about this -- we're still in 0.X.X so we have the right, but we want to be smart about this.
@jernejfrank -- any thoughts?
Mergine despite build failures @jernejfrank has agreed to fix!