Support rolling database credentials

Open johnnyaug opened this issue 3 years ago • 3 comments

Allow the database credentials that lakeFS uses to connect to Postgres to be replaced safely. This is useful, for example, when connecting to AWS RDS using IAM roles: this method provides credentials that expire after 15 minutes. This was brought up on this Slack thread.

johnnyaug avatar Aug 18 '22 19:08 johnnyaug

A few thoughts/clarifications here as the original thread author.

  • I was looking into a CDK Stack to spin up LakeFS on ECS and didn't want to expose my RDS password in my CloudFormation template or ECS environment variable.
  • RDS supports IAM-based authentication so I was hopeful I could use that in lieu of a hard-coded password.

I found that IAM-based authentication is, in reality, a process that requires calling generate-db-auth-token to produce a short-lived token, which is then used as the password to connect to the database. Hence the "rolling" nature of the credential.
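To make the "rolling" part concrete, here is a minimal sketch of a token cache that refreshes a short-lived credential shortly before it expires. It is only an illustration, not lakeFS code: `RollingToken`, `fetch`, and `margin_s` are hypothetical names, and `fetch` stands in for whatever actually produces the token (for example, a call to generate-db-auth-token).

```python
import time
from typing import Callable, Optional


class RollingToken:
    """Caches a short-lived credential and refreshes it shortly before expiry."""

    def __init__(
        self,
        fetch: Callable[[], str],      # hypothetical: returns a fresh token
        lifetime_s: float = 15 * 60,   # the 15-minute lifetime mentioned in this issue
        margin_s: float = 60,          # assumed safety margin before expiry
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._lifetime_s = lifetime_s
        self._margin_s = margin_s
        self._clock = clock
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        """Return the cached token, fetching a new one if it is missing or about to expire."""
        now = self._clock()
        if self._token is None or now >= self._expires_at - self._margin_s:
            self._token = self._fetch()
            self._expires_at = now + self._lifetime_s
        return self._token
```

The idea would be that anything opening a new database connection asks `get()` for the current password instead of reading a value fixed at startup.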

I modified db/connect.go in LakeFS to try to make use of this token (see code below) and, while that worked for the initial connection, it looks like MigrateUp in db/migration.go makes use of the original database parameters and raised an exception because the password field was missing.

db/connect.go modification
	// When no password is configured, fall back to an RDS IAM auth token.
	// Uses aws, aws/session, and service/rds/rdsutils from github.com/aws/aws-sdk-go.
	if config.ConnConfig.Password == "" {
		log.Warn("No password provided for database connection, attempting IAM authentication")
		dbEndpoint := fmt.Sprintf("%s:%d", config.ConnConfig.Host, config.ConnConfig.Port)
		awsRegion := "us-west-2" // hard-coded region
		conf := aws.NewConfig().WithRegion(awsRegion)
		sess := session.Must(session.NewSession(conf))
		creds := sess.Config.Credentials
		authToken, err := rdsutils.BuildAuthToken(dbEndpoint, awsRegion, config.ConnConfig.User, creds)
		if err != nil {
			panic(err)
		}
		// The short-lived token is used as the connection password.
		config.ConnConfig.Password = authToken
	}

What ended up working for me, at least for now, to avoid exposing the password was disabling IAM auth and using a normal password stored in Secrets Manager. ECS supports a "secrets" section that populates environment variables from values in Secrets Manager, and I noticed that pgconn.ParseConfig supports merging environment variables into the connection string. So in my ECS config, I set LAKEFS_DATABASE_CONNECTION_STRING to postgresql://lakefsadmin@{hostname}:{port}/postgres (notice no password), and in my secrets I set PGPASSWORD to the ARN of my database password secret in Secrets Manager. lakeFS/pgx then magically merges the password into my connection string. 🙌
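As a rough illustration of that fallback, the sketch below takes the password from the connection string when one is present and otherwise from PGPASSWORD. It is a simplified stand-in written with the Python standard library, not pgconn's actual implementation; `resolve_password` is a hypothetical helper, and the hostnames in the usage lines are placeholders.

```python
import os
from typing import Mapping, Optional
from urllib.parse import unquote, urlsplit


def resolve_password(
    conn_string: str, env: Mapping[str, str] = os.environ
) -> Optional[str]:
    """Return the password from the URL if present, else from PGPASSWORD."""
    parts = urlsplit(conn_string)
    if parts.password is not None:
        # An explicit password in the connection string takes precedence.
        return unquote(parts.password)
    # No password in the URL: fall back to the environment variable.
    return env.get("PGPASSWORD")


# Usage with placeholder values:
print(resolve_password(
    "postgresql://lakefsadmin@db.example.internal:5432/postgres",
    {"PGPASSWORD": "from-secrets-manager"},
))
```

With this shape, the connection string itself can stay free of secrets, and the password only exists in the container's environment at runtime.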

For reference, here's the (python) CDK resource I ended up with for running this on Fargate in ECS:

connection_string = f"postgresql://lakefsadmin@{rds.db_instance_endpoint_address}:{rds.db_instance_endpoint_port}/postgres"

lakefs_service = ecs_patterns.ApplicationLoadBalancedFargateService(
    self,
    "lakefs-service",
    cluster=cluster,
    memory_limit_mib=1024,
    desired_count=1,
    cpu=512,
    task_image_options=ecs_patterns.ApplicationLoadBalancedTaskImageOptions(
        image=ecs.ContainerImage.from_registry(
            "treeverse/lakefs:latest",
            credentials=secretsmanager.Secret.from_secret_name_v2(
                    self, "DockerHubPAT", "dev/DockerHubSecret"
                )
        ),
        container_port=8000,
        execution_role=lakefs_dbrole,   # This role is used when provisioning containers (will be granted access to dockerhub secret)
        task_role=lakefs_containerrole, # This role is used by the running container (needs access to database.secret)
        environment={
            "LAKEFS_BLOCKSTORE_TYPE": "s3",
            "LAKEFS_DATABASE_CONNECTION_STRING": connection_string,
        },
        secrets={
            "LAKEFS_AUTH_ENCRYPT_SECRET_KEY": ecs.Secret.from_secrets_manager(
                secret_key
            ),
            "PGPASSWORD": ecs.Secret.from_secrets_manager(database.secret, "password"),
        },
    ),
)

dacort avatar Aug 19 '22 19:08 dacort

This issue is now marked as stale after 90 days of inactivity, and will be closed soon. To keep it, mark it with the "no stale" label.

github-actions[bot] avatar Nov 01 '23 14:11 github-actions[bot]

Marking "not stale" until proven otherwise; this all sounds convincingly like an issue for some particular configuration on AWS.

arielshaqed avatar Nov 07 '23 08:11 arielshaqed