Support psycopg3 RFC
This issue has been migrated from #14586.
Description:
The latest version of psycopg is psycopg3; it supports asyncio and also works in pure python for better pypy compatibility.
In a django app that I develop, it's almost 5% faster with more optimizations coming. I'm also a synapse user and would be interested in trying to port synapse to psycopg3 for the improved performance.
- Would such a PR be of interest to the synapse project?
- If so, could this be a complete migration to psycopg3, or would you want the codebase to support both 2 & 3?
- If it's critical to retain the support if psycopgcffi, then both would need to be supported.
- (Given the pure-python mode, perhaps pypy performance on psycopg3 would be equivalent to psycopgcffi?)
Any thoughts on this proposal? It might be biting off more than I can chew to do a migration, but I'd like to try if the synapse maintainers are open to it.
@realtyem and I were looking at this again from the old branch I had... I now have a branch which passes all unit tests using psycopg3. It isn't massive, but is pretty big. I do not know if it has any performance impact yet. (And it has a few lints to fix.)
From my POV it would be reasonable to propose this in a few stages:
- Support both psycopg2 & psycopg as drivers for postgres, this works by adding another "engine" and some if-defs.
- After ensuring the above works OK, drop support for psycopg2.
- Eventually, rework some of the innards of
synapse.storage.databaseto be async-native, bypassing the Twisted database threadpool when using an async-compatible database driver. - (Maybe?) Replace SQLite support with an async driver.
I know @erikjohnston at some pointed mentioned trying to by-pass the Python database driver completely, but I have no idea if there's any plan to do this.
@clokep that sounds like a decent plan.
I have been investigating what a Rust based DB pool would look like, but it's sufficiently far away that I don't think we should block upgrading from psycopg2. If/when we change the internal database APIs to be async-native we should just try and make sure that it won't make replacing it with a Rust db pool harder.
(For others following: the reason for wanting a Rust DB pool is to allow Rust modules to access the DB without having to go back into Python).
Some work towards this:
- matrix-org/synapse#16618
- matrix-org/synapse#16615
- matrix-org/sytest#1390
Thanks for implementing step 1! It does make me wonder how exactly we want to migrate users across. I think it makes sense to initially allow homeservers to opt-in to using the new driver (for testing purposes), but ideally at some point we'd make it the default in some Synapse version. The plan though, as written, seems to suggest that we'd require homeserver admins to manually change the config to use the new driver? I'm wondering if that is necessary, or we should rename it "postgres" and have the exact driver be specified via a different config option (as most homeservers won't care)
I'm wondering if that is necessary, or we should rename it "postgres" and have the exact driver be specified via a different config option (as most homeservers won't care)
You mean something like:
database:
name: postgres # allow psycopg2 for backwards-compat
driver: psycopg. # is undefined, fallback to psycopg2, eventually default to psycopg in the future
...
That might be more elegant, yeah. I think we were trying to avoid being too invasive, but that should be doable. @realtyem any thoughts?
I'm not against it. But it comes down to just delaying the inevitable. At some point existing admins will have to touch the configuration.
- Consider the
sqlite/sqlite3angle, that should probably match. - Consider the future, that the connection pooling settings may need to be changed.
- Will have to touch sytest again
- Prior art for when it's time to deprecate a setting, in both circumstances:
- backwards compatible for perpetuity
- force making the change by prohibiting start up
- The confusing aspect of vastly similar names(
psycopg2vspsycopgwhich is actually psycopg3)
Even with all the things I can think of, I do believe this is a worthwhile future-proofing that includes the ability to have a choice. Do we want this as part of this existing work, or a separate follow up?
Why would a they have to touch it in the future? I think the idea is:
- You add it, default to psycopg2 but support psycopg
- Eventually you change default to psycopg but allow psycopg2
- Remove it, forcing psycopg
I guess I see what you are talking about. Even if it's left at today's standard of name: psycopg2 will be just be psycopg in the code, like an alias.