Postgres proxy on cloud
Context
We have a multi-tenant cloud but use one schema per workspace. We could leverage this to let people access their data externally (e.g. via Metabase or Tableau). We need to find a way to do that securely (we cannot expose Postgres publicly for all tenants).
Proposal
My initial proposal was to set up a new Postgres DB that we would use as a proxy, with foreign data wrappers pointing to our read replica. That way we would only expose data for people who opt in, and we could add additional security layers (connection limits enforced by the read replica, read-only access at both the prod DB and read-replica level, additional logging tailored specifically for the proxy, etc.).
Then, talking to @Freebios, we realized that building a wrapper in Node wasn't actually complex. It feels a bit hacky as we're almost at the packet level, but it seems to work reasonably well.
Proof of concept
Here's sample code that works (the IP whitelisting part wasn't tested):
```js
const net = require('net');

const HOST = 'localhost'; // PostgreSQL server host
const PG_PORT = 5432; // PostgreSQL server port
const PROXY_PORT = 5433; // Port for the proxy server to listen on

// Create a server that listens for client connections
const server = net.createServer((clientSocket) => {
  console.log('Client connected.');

  if (!checkSecurity(clientSocket)) {
    console.log('Security check failed');
    clientSocket.end('Security check failed\n');
    return;
  }

  // Connect to the PostgreSQL server
  const serverSocket = net.connect({ host: HOST, port: PG_PORT }, () => {
    console.log('Proxy connected to PostgreSQL server.');
  });

  // Relay data from client to PostgreSQL server
  clientSocket.on('data', (data) => {
    console.log('SQL Query:', data.toString()); // Log the SQL query
    serverSocket.write(data);
  });

  // Relay data from PostgreSQL server back to client
  serverSocket.on('data', (data) => {
    console.log('PostgreSQL Response:', data.toString()); // Log the response
    clientSocket.write(data);
  });

  // Handle client disconnect
  clientSocket.on('end', () => {
    console.log('Client disconnected.');
    serverSocket.end();
  });

  // Handle server disconnect
  serverSocket.on('end', () => {
    console.log('PostgreSQL server connection closed.');
    clientSocket.end();
  });

  // Error handling for the client
  clientSocket.on('error', (err) => {
    console.log('Client Error:', err.message);
    clientSocket.destroy();
    serverSocket.end();
  });

  // Error handling for the server connection
  serverSocket.on('error', (err) => {
    console.log('Server Error:', err.message);
    serverSocket.destroy();
    clientSocket.end();
  });
});

server.listen(PROXY_PORT, () => {
  console.log(`Proxy server listening on port ${PROXY_PORT}`);
});

function checkSecurity(socket) {
  // Implement security checks here, e.g. IP whitelisting (untested):
  // const allowedIPs = ['192.168.1.100', '192.168.1.101', '127.0.0.1'];
  // return allowedIPs.includes(socket.remoteAddress);
  return true;
}
```
(Note: this is plain JavaScript, not TypeScript.)
Implementation
- Create a page for people to opt in to Postgres exposure (it will live within integrations, probably under "Metabase" or something like that initially). Once they've opted in, they can click to reveal the Postgres user/password/host.
- On the backend side, add columns to track whether Postgres exposure is enabled and whether a password was set (we've already done db-side encryption for remoteServer options). I'd suggest two columns, one for enablement (postgresEnabledAt) and one for the password, so that if someone toggles the feature off and on, the password remains the same.
- Create the proxy adapting the proof of concept above.
- Improve the proxy to add additional security/logging (e.g. ActivityLogs + report malicious attempts to Sentry?), and maybe only allow the SELECT command?
- Add a feature to whitelist IPs.
From a structure perspective, I think the easiest option would be to run this as a separate package, twenty-postgres-proxy, with a README, tests, etc. It's possible to tell NestJS to listen on two ports, but I think that would add some unnecessary slowness and complexity. Since this could be a risky area, keeping the code small, extremely well tested, and isolated seems like a good option.
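For the "only allow SELECT" idea, here is a minimal, hypothetical sketch of inspecting simple-query ('Q') messages in the PostgreSQL wire protocol before relaying them. It assumes a whole message arrives in one chunk; a real implementation would need to buffer across TCP chunk boundaries and also handle the extended query protocol (Parse/Bind/Execute).

```javascript
// Sketch: only let SELECT statements through the proxy by inspecting
// simple-query ('Q') messages. Any non-'Q' traffic (startup, auth, etc.)
// is passed through unchanged.
function isAllowedChunk(data) {
  if (data.length < 5 || data[0] !== 0x51) return true; // not a 'Q' message
  // Skip the type byte (1) and the int32 length (4); drop the trailing NUL.
  const query = data.toString('utf8', 5, data.length - 1);
  return /^\s*select\b/i.test(query);
}

// Helper to build a simple-query message, useful for testing the filter.
function makeQueryMessage(sql) {
  const body = Buffer.from(sql + '\0', 'utf8');
  const msg = Buffer.alloc(5 + body.length);
  msg.write('Q', 0);
  msg.writeInt32BE(4 + body.length, 1);
  body.copy(msg, 5);
  return msg;
}
```

In the proxy above, the clientSocket 'data' handler would call isAllowedChunk(data) and close the connection instead of forwarding when it returns false.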