Connection Configuration Improvements
Citus has long lacked a facility for specifying much of anything other than a host and port name for worker machines. While this simple <host, port> approach works for the simplest of cases, when more complex addressing systems, SSL configuration, authentication schemes/rotations, special routing for connection pools vs not, and non-trivial host name generation requires effort from end-users to implement, resulting in storage of these concerns in a manner external to Citus and often brittle.
For instance, deployments needing passwords must write software to deploy and maintain pgpass files in specific locations on the filesystem. If they become out-of-sync with what is required by the remote nodes, connections establishment will fail. Debugging this can be difficult, given the need to inspect the pgpass files to determine what is wrong (or to verify that they are even being picked up at all). This is all without even considering the risk of storing passwords in plain text on a filesystem.
Proposal Details
To address these needs, we propose extending Citus' node metadata to allow arbitrary libpq connection parameters on a per-node basis. To avoid having to use libpq's query string parsing functions (and to use better structure, generally) we plan to use the JSONB type for this column.
Additionally, we will extend the dictionary of libpq parameters with some Citus-specific ones to address certain needs (i.e. addressing a node directly or through a connection pool such as pgbouncer). Further parameters may be added as common pain points continue to be identified.
Because of the sensitivity of authentication information, this new column must not contain passwords. Besides, typical deployments assign passwords on a <user, host>-unique basis, not simply by host. Due to this, a superuser-only table will be added to store such information. Rows will reference an existing pg_dist_node row and add user and password information. In keeping with the spirit of existing pgpass functionality, * will be accepted in the user field as a wildcard match for all users (a similar accommodation could be made to allow a given user to have the same password on all hosts).
Finally, we have come across deployments where the host and user name used during connection establishment do not match the simple values available at runtime (i.e. the user name differs from PostgreSQL's internal GetUserId). We have yet to find a case where a simple templating mechanism could not address these concerns. As such, fields where the value might be a host or user name will accept a simple templating system in which %%HOSTNAME and %%USERNAME%% will be replaced by the worker node name (as specified in pg_dist_node) and current user name (as calculated by our existing CurrentUserName method).
Implementation Concerns
The question arose: do we replicate all of this new information to schema nodes in an MX deployment? We've decided for now to only replicate the non-sensitive connection options, requiring users to manually propagate their password information. If this is too unexpected, we can always simply change to propagate both sets of information.
Unrelatedly, within Citus we do a lot of work to ensure connections can be looked up within a cache to mitigate the impact of connection establishment time on performance. At present, connections are looked up by <host, port, user>, but this will be insufficient if we start differentiating between pooler and direct connections to a given host. Simply adding more fields to the key should be sufficient.
Tasks
We expect this work to take longer than one week to complete, but less than a full two. How familiar the implementor is with the existing metadata/connection access/caching may affect implementation time in one way or the other.
- Add
connoptscolumn topg_dist_node— Schema change and access logic. Ensure upgrade path from existing deployments and add logic to ban certain keys (such as password). Two days - Add superuser-only auth table — Begin to use new parameters and authentication in connection establishment. Understand wildcarding and suggest migration for
pgpassusers. Two days - Implement mini-templating system — Test with exotic configurations where remote user/host names do not match local ones. Add extensive testing for various uses. Two days
- Augment existing cache code — Verify that cache is properly used for metadata and connection access. Check invalidation logic. Make sure there isn't improper aliasing. Three days
- Explain to documentation — Write up summary of usage to help documentation of the new capabilities and changes. One-half day
Stretch/Later
Some additional functionality can wait for later.
- Allow separate
pgbouncer/direct modes — Have to figure out which operations (COPY, rebalance, replication, even DDL?, etc.) need direct access to nodes and which can go throughpgbouncer. Add new optional param to specify how to access pooler separate from direct connections - Docker modifications — Might be nice to update
docker-composeto use this new functionality (generate a random password at startup and use that) rather than using non-authenticated connections. - Usability improvements (Enterprise-only) — Some UDFs could help the ergonomics of adding a new user and their associated password to all nodes, etc.
- Fix MX/rebalancer interaction issues — I believe some of these existing limitations could be removed if the pooler/direct parameter is available
- Start using this ourselves — See above
Gotchas
I believe existing tests should be able to function just fine without modification, but some more complex multi-user tests might require some changes to operate under the new system.
It's slightly vexatious to have another representation than libpqs, but maybe okay considering there are other non-libpq related options at stake and one representation may be better than two.
How's the progress on this one coming? Ran into this one, as I thought I could create another user and use that as a connection but I guess what didn't work. (or maybe I was doing something wrong)