dbt-core icon indicating copy to clipboard operation
dbt-core copied to clipboard

[CT-1484] [Feature] SSO for Redshift

Open dnascimento opened this issue 2 years ago • 13 comments

Is this your first time submitting a feature request?

  • [X] I have read the expectations for open source contributors
  • [X] I have searched the existing issues, and I could not find an existing issue for this feature
  • [X] I am requesting a straightforward extension of existing dbt functionality, rather than a Big Idea better suited to a discussion

Describe the feature

Redshift now supports native IdP Federation, for instance Azure AD. Could we add support for that?

Describe alternatives you've considered

The IAM permission is ok but it doesn't allow to distinguish and manage users as easily as AzureAD.

Who will this benefit?

Everyone using Redshift with IdP

Are you interested in contributing this feature?

yes

Anything else?

No response

dnascimento avatar Nov 09 '22 12:11 dnascimento

Thanks for reaching out @dnascimento !

Were you thinking about Redshift SSO within dbt Core or within dbt Cloud?

dbeatty10 avatar Nov 14 '22 16:11 dbeatty10

Hi, dbt core. This is specially useful for developers running tests locally

dnascimento avatar Nov 14 '22 19:11 dnascimento

Yeah, it makes sense to add this support 👍

There's a few things we'd need to do to make this happen:

  1. https://github.com/dbt-labs/dbt-redshift/issues/219
  2. Enable the applicable connection parameters within profiles.yml for dbt-redshift (implementation here using this as an example)
  3. Add the new parameters here (using this as an example)

dbeatty10 avatar Nov 15 '22 00:11 dbeatty10

Thanks for leading this @dbeatty10 ! I can try to help with the other points. It will make dbt easier to consume

dnascimento avatar Nov 15 '22 06:11 dnascimento

Hello. Any update in this? Our redshift is SSO enabled thru Native IdP and there is a strong demand to enable this for end users for local dbt runs.

frolundo avatar Jun 28 '23 01:06 frolundo

Same setup for us (IdP) and it’s a high priority to switch to SSO for both dbt core and dbt cloud

misteliy avatar Jul 01 '23 06:07 misteliy

This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please comment on the issue or else it will be closed in 7 days.

github-actions[bot] avatar Jan 02 '24 01:01 github-actions[bot]

Although we are closing this issue as stale, it's not gone forever. Issues can be reopened if there is renewed community interest. Just add a comment to notify the maintainers.

github-actions[bot] avatar Jan 10 '24 01:01 github-actions[bot]

Thanks for reopening!

misteliy avatar Jan 16 '24 06:01 misteliy

Hi,

Is there any update in this ticket? We use DBT core heavily and managing all data warehouse developers without AzureID is becoming increasingly difficult.

adam-staros95 avatar Apr 29 '24 10:04 adam-staros95

Hi,

As we are using version 1.6, I was trying to make following changes in dbt-redshift library: dbt/adapters/redshift/connections.py:

class RedshiftConnectionMethod(StrEnum):
    DATABASE = "database"
    IAM = "iam"
    SSO = "sso"

###

class RedshiftConnectMethodFactory:
    credentials: RedshiftCredentials

###

            def connect():
                logger.info("Connecting to redshift with domain credentials...")
                c = redshift_connector.connect(
                    iam=False,
                    db_user='',
                    cluster_identifier=self.credentials.cluster_id,
                    scope='<our_scope>',  #only for tests, should be set in 'profiles.yml'
                    client_id='<our_client_id>', #only for tests, should be set in 'profiles.yml'
                    idp_tenant='<our_ idp_tenant>', #only for tests, should be set in 'profiles.yml'
                    listen_port=7890,
                    credentials_provider='redshift_connector.plugin.BrowserAzureOAuth2CredentialsProvider',
                    user='',
                    password='',
                    ** kwargs,
                )
                if self.credentials.autocommit:
                    c.autocommit = True
                if self.credentials.role:
                    c.cursor().execute("set role {}".format(self.credentials.role))
                return c

Next I updated profiles.yml:

jaffle_shop:
  outputs:
    default:
      dbname: <redshift_db>
      host: <redshift_host>
      port: 5439
      schema: staging
      threads: 4
      type: redshift
      ra3_node: true
      user: n/a
      password: n/a
      method: sso

and tested code locally.

During dbt debug command, browser window opens and I'm able to connect to the Redshift cluster.

However, when I execute dbt run command multiple browser windows opens, making it almost not possible to authenticate. Do you know how to avoid such behavior? Is there is an option to somehow cache Redshift credentials after first login?

FYI: I tried similar code with DBT version 1.8, but I had exactly the same problems.

adam-staros95 avatar May 21 '24 08:05 adam-staros95

I managed to solve an issue with opening multiple browser windows using combination of redshift_connector.plugin.BrowserAzureOAuth2CredentialsProvider and redshift_connector.plugin.BasicJwtCredentialsProvider.

Updated dbt/adapters/redshift/connections.py code:

class RedshiftConnectionMethod(StrEnum):
    DATABASE = "database"
    IAM = "iam"
    SSO = "sso"

### 

IDP_TOKEN = ''


def _token_expiring() -> bool:
    tolerance_seconds = 60
    current_timestamp = int(time.time())
    idp_token_payload = IDP_TOKEN.split(".")[1]
    idp_token_payload += '=' * (-len(idp_token_payload) % 4)
    idp_token_expire_time = json.loads((base64.b64decode(idp_token_payload)).decode("utf-8")).get('exp', 0)
    return True if idp_token_expire_time - current_timestamp <= tolerance_seconds else False

class RedshiftConnectMethodFactory:
    credentials: RedshiftCredentials

    def __init__(self, credentials):
        self.credentials = credentials

    def get_connect_method(self):
        global IDP_TOKEN
        method = self.credentials.method
        kwargs = {
            "host": self.credentials.host,
            "database": self.credentials.database,
            "port": int(self.credentials.port) if self.credentials.port else int(5439),
            "auto_create": self.credentials.autocreate,
            "db_groups": self.credentials.db_groups,
            "region": self.credentials.region,
            "timeout": self.credentials.connect_timeout,
        }

        redshift_ssl_config = RedshiftSSLConfig.parse(self.credentials.sslmode)
        kwargs.update(redshift_ssl_config.to_dict())

        # Support missing 'method' for backwards compatibility
        if method == RedshiftConnectionMethod.DATABASE or method is None:
            ###

        elif method == RedshiftConnectionMethod.SSO:
            logger.debug("Connecting to redshift with Azure SSO")
            if IDP_TOKEN == '' or _token_expiring():
                def connect():
                    global IDP_TOKEN
                    c = redshift_connector.connect(
                        iam=False,
                        db_user='',
                        scope='<our_scope>',  # only for tests, should be set in 'profiles.yml'
                        client_id='<our_client_id>',  # only for tests, should be set in 'profiles.yml'
                        idp_tenant='<our_ idp_tenant>',  # only for tests, should be set in 'profiles.yml'
                        listen_port=7890,
                        credentials_provider='redshift_connector.plugin.BrowserAzureOAuth2CredentialsProvider',
                        user='',
                        password='',
                        idp_response_timeout=50,
                        **kwargs,
                    )
                    if self.credentials.autocommit:
                        c.autocommit = True
                    if self.credentials.role:
                        c.cursor().execute(f"set role {self.credentials.role}")
                    IDP_TOKEN = c.web_identity_token
                    return c
            else:
                def connect():
                    global IDP_TOKEN
                    c = redshift_connector.connect(
                        iam=False,
                        credentials_provider='redshift_connector.plugin.BasicJwtCredentialsProvider',
                        password='',
                        web_identity_token=IDP_TOKEN,
                        **kwargs,
                    )
                    if self.credentials.autocommit:
                        c.autocommit = True
                    if self.credentials.role:
                        c.cursor().execute(f"set role {self.credentials.role}")
                    return c

I know it is not the cleanest solution, but it currently works for us.

adam-staros95 avatar Jun 03 '24 15:06 adam-staros95