twenty icon indicating copy to clipboard operation
twenty copied to clipboard

Postgres proxy on cloud

Open FelixMalfait opened this issue 9 months ago • 0 comments

Context

We have a multi tenant-cloud but use 1 schema per workspace. We could leverage this to our advantage to let people access their data externally (e.g. via Metabase or Tableau) We need to find a way to do that in a secure way (we cannot expose Postgres publicly/for all tenants)

Proposal

My initial proposal was to setup a new Postgres DB that we would use as a proxy with foreign data wrappers pointing to our read-replica. That way we would only expose the data for people that opt-in + we could add additional security layers (connection limit enforced by the read replica, read-only at prod db-level + read-replica level, additional logging tailored specifically for the proxy etc).

Then talking to @Freebios we realized that building a wrapper in Node wasn't actually complex. It feels a bit hacky as we're almost at the packet level but it seems to work reasonably well

Proof of concept

Here's a sample code that works (IP whitelisting part wasn't tested)

const net = require('net');

const HOST = 'localhost'; // PostgreSQL server host
const PG_PORT = 5432; // PostgreSQL server port
const PROXY_PORT = 5433; // Port for proxy server to listen on

// Create a server that listens for client connections
const server = net.createServer((clientSocket) => {
    console.log('Client connected.');

    const checkPassed = checkSecurity(clientSocket);
    if (!checkPassed) {
        console.log('Security check failed');
        socket.end('Security check failed\n');
        return;
    }

    // Connect to the PostgreSQL server
    const serverSocket = net.connect({ host: HOST, port: PG_PORT }, () => {
        console.log('Proxy connected to PostgreSQL server.');
    });

    // Relay data from client to PostgreSQL server
    clientSocket.on('data', (data) => {
        console.log('SQL Query:', data.toString()); // Log the SQL query
        serverSocket.write(data);
    });

    // Relay data from PostgreSQL server back to client
    serverSocket.on('data', (data) => {
        console.log('PostgreSQL Response:', data.toString()); // Log the PostgreSQL response
        clientSocket.write(data);
    });

    // Handle client disconnect
    clientSocket.on('end', () => {
        console.log('Client disconnected.');
        serverSocket.end();
    });

    // Handle server disconnect
    serverSocket.on('end', () => {
        console.log('PostgreSQL server connection closed.');
        clientSocket.end();
    });

    // Error handling for the client
    clientSocket.on('error', (err) => {
        console.log('Client Error:', err.message);
        clientSocket.destroy();
        serverSocket.end();
    });

    // Error handling for the server connection
    serverSocket.on('error', (err) => {
        console.log('Server Error:', err.message);
        serverSocket.destroy();
        clientSocket.end();
    });
});

server.listen(PROXY_PORT, () => {
    console.log(`Proxy server listening on port ${PROXY_PORT}`);
});


function checkSecurity(socket) {
    // Implement your security checks here
    // For example, check the IP address

    return true;

    const allowedIPs = ['192.168.1.100', '192.168.1.101', 'localhost'];
    return allowedIPs.includes(socket.remoteAddress);
}

(note this is not Typescript)

Implementation

  1. Create a page for people to opt-in to Postgres exposure (will be within integrations, probably under "Metabase" or something like that intially), if they've opted in then they can click to reveal the postgres user/password/host.
  2. On the backend side we should add a column to keep track whether Postgres exposure was enabled and if a password was set (we've done db-side encryption already for remoteServer options). I'd say we should do two columns for enablement (postgresEnabledAt) and password, that way if someone toggle on-off then password remains the same.
  3. Create the proxy adapting the proof of concept above.
  4. Improve the proxy to add additional security/logging (e.g. ActivityLogs + report malicious attempts to Sentry?), maybe only allow SELECT command?
  5. Feature to whitelist IP

From a structure perspective I think the easiest would be to run this as a separate package twenty-postgres-proxy ; add a README, tests, etc. It's possible to tell NestJS to listen on 2 ports but I think that will add some un-necessary slowness and complexity. Since this could be a risky area, keeping the code small, extremely well tested and isolated seems like a good option.

FelixMalfait avatar May 11 '24 08:05 FelixMalfait