
feat: allow connection with pre-configured socket

Open jackwotherspoon opened this issue 1 year ago • 6 comments

I'd like to add support for creating a Postgres connection over an existing socket-like object. With this approach, the caller decides whether startTLS is used. This lines up with what we have in Java (socketFactory connection param, driver code), Go (pgx DialFunc), and other Python libraries (pymysql and pg8000). The Java and Go versions differ slightly in that they accept a creator/generator func that creates the socket, whereas here we could just pass in the socket directly, but the scope of the feature is the same.

The equivalent pymysql PR that introduced this change explains the feature really well: https://github.com/PyMySQL/PyMySQL/pull/355

The benefit of this feature is that it allows the user to specify their own secure tunnel to connect over (such as ssh).

Is this sort of feat possible for asyncpg?

jackwotherspoon avatar Jul 27 '23 15:07 jackwotherspoon

Sure. A socket factory callback to connect() would likely be the cleanest and most straightforward approach, though there are likely asyncio-imposed requirements on what the returned socket can be (at the very least it should support non-blocking I/O and be compatible with epoll).

elprans avatar Jul 27 '23 16:07 elprans

@elprans thanks for the quick response!

Any ideas on where I should look to get started on this? Any tips or further suggestions would be greatly appreciated 😄

jackwotherspoon avatar Jul 27 '23 16:07 jackwotherspoon

Sure, you need to pass the callback all the way to __connect_addr and then pass the result of calling it to loop.create_connection() as the sock argument.
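The hand-off described above can be tried outside asyncpg. What follows is only a minimal sketch, not asyncpg code: `connect_with_socket_callback`, `EchoClient`, and `demo` are hypothetical names, and a local echo server stands in for Postgres. The only part taken from the suggestion is awaiting a user-supplied callback and passing its result to `loop.create_connection()` as `sock`.

```python
import asyncio
import socket


class EchoClient(asyncio.Protocol):
    """Stand-in for the driver protocol; collects what the peer sends back."""

    def __init__(self, done: asyncio.Future):
        self.done = done
        self.buf = b""

    def connection_made(self, transport):
        transport.write(b"ping")

    def data_received(self, data):
        self.buf += data
        if len(self.buf) >= 4 and not self.done.done():
            self.done.set_result(self.buf)


async def connect_with_socket_callback(proto_factory, socket_callback):
    """Await the user callback, then hand its socket to asyncio via sock=."""
    loop = asyncio.get_running_loop()
    sock = await socket_callback()
    return await loop.create_connection(proto_factory, sock=sock)


async def demo() -> bytes:
    async def echo(reader, writer):
        # Echo the first four bytes back, then hang up.
        writer.write(await reader.readexactly(4))
        await writer.drain()
        writer.close()

    server = await asyncio.start_server(echo, "127.0.0.1", 0)
    host, port = server.sockets[0].getsockname()[:2]

    async def socket_callback() -> socket.socket:
        # The blocking connect runs in a worker thread, off the event loop.
        return await asyncio.to_thread(socket.create_connection, (host, port))

    done = asyncio.get_running_loop().create_future()
    transport, _ = await connect_with_socket_callback(
        lambda: EchoClient(done), socket_callback
    )
    try:
        return await asyncio.wait_for(done, timeout=5)
    finally:
        transport.close()
        server.close()
        await server.wait_closed()
```

Running `asyncio.run(demo())` drives one round trip over the pre-connected socket. In asyncpg itself the callback would presumably have to be threaded through the connection parameters, which this sketch leaves out.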

elprans avatar Jul 27 '23 16:07 elprans

Hi @elprans just wondering if I could pick your brain as I've started attempting to implement this feature.

So in __connect_addr if I understand correctly it would look like this:

elif params.socket_callback:
    # if socket factory callback is given, create socket and use
    # for connection
    sock = await params.socket_callback()
    connector = loop.create_connection(proto_factory, sock=sock)

I'm trying to see if that would work for the following callback...

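import asyncio
import socket

import asyncpg

def sock_func(host: str) -> socket.socket:
    # Blocking connect to the proxy; SERVER_PROXY_PORT is defined elsewhere.
    return socket.create_connection((host, SERVER_PROXY_PORT))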

async def main():
    host = "X.X.X.X"
    async def async_sock_func():
        return await asyncio.to_thread(sock_func, host)
    
    return await asyncpg.connect(
        user=user,
        database=db,
        password=passwd,
        socket_callback=async_sock_func,
        **kwargs,
    )

Let me know your thoughts, looking forward to hearing them 😄

jackwotherspoon avatar Nov 30 '23 21:11 jackwotherspoon

@elprans I linked a WIP PR for our use-case that seems to be working with my PR branch build 😄

jackwotherspoon avatar Dec 04 '23 15:12 jackwotherspoon

How about exposing a connector factory, such that callers would have more control over how the socket was created?

For example, in __connect_addr, we'd add:

elif params.connector_factory:
    connector = params.connector_factory(proto_factory, *addr, loop=loop, ssl=params.ssl)

This would allow full customization of the socket and how it is created (e.g., creating an SSH tunnel, or reading from and writing to the socket before the Postgres protocol starts when a proxy sits in front of the database).
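To make the proxy case concrete, here is a rough sketch of what such a factory might look like. Everything in it is an assumption for illustration: `make_connector_factory`, the `PREAMBLE`/`ACK` handshake, and the fake proxy are hypothetical, the socket is IPv4-only, and only the call shape `(proto_factory, host, port, loop=..., ssl=...)` follows the snippet above. The factory does its own reads and writes on the raw socket before handing it to `loop.create_connection()`.

```python
import asyncio
import socket

PREAMBLE = b"HELLO-PROXY\n"  # hypothetical proxy handshake, not a real protocol
ACK = b"OK\n"                # hypothetical proxy acknowledgement


def make_connector_factory(preamble: bytes = PREAMBLE, ack: bytes = ACK):
    async def connector_factory(proto_factory, host, port, *, loop, ssl=None):
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        sock.setblocking(False)
        try:
            await loop.sock_connect(sock, (host, port))
            # Talk to the proxy before the database protocol starts.
            await loop.sock_sendall(sock, preamble)
            reply = b""
            while len(reply) < len(ack):
                chunk = await loop.sock_recv(sock, len(ack) - len(reply))
                if not chunk:
                    break
                reply += chunk
            if reply != ack:
                raise ConnectionError(f"unexpected proxy reply: {reply!r}")
        except BaseException:
            sock.close()
            raise
        # asyncio requires server_hostname when ssl is used with sock=.
        return await loop.create_connection(
            proto_factory, sock=sock, ssl=ssl,
            server_hostname=host if ssl else None,
        )

    return connector_factory


class Probe(asyncio.Protocol):
    """Stand-in for the driver protocol."""

    def __init__(self, done: asyncio.Future):
        self.done = done
        self.buf = b""

    def connection_made(self, transport):
        transport.write(b"ping")

    def data_received(self, data):
        self.buf += data
        if len(self.buf) >= 4 and not self.done.done():
            self.done.set_result(self.buf)


async def demo():
    seen = []

    async def fake_proxy(reader, writer):
        seen.append(await reader.readline())        # the preamble
        writer.write(ACK)
        writer.write(await reader.readexactly(4))   # then echo the payload
        await writer.drain()
        writer.close()

    server = await asyncio.start_server(fake_proxy, "127.0.0.1", 0)
    host, port = server.sockets[0].getsockname()[:2]
    loop = asyncio.get_running_loop()
    done = loop.create_future()
    factory = make_connector_factory()
    transport, _ = await factory(lambda: Probe(done), host, port, loop=loop, ssl=None)
    try:
        reply = await asyncio.wait_for(done, timeout=5)
    finally:
        transport.close()
        server.close()
        await server.wait_closed()
    return seen[0], reply
```

Compared with a plain socket callback, a factory of this shape also sees `proto_factory` and `ssl`, so it can decide when TLS starts relative to the proxy handshake.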

enocom avatar Dec 22 '23 00:12 enocom