databricks-sdk-py

sql connection using service principal to specific table

Open chkp-orsaa opened this issue 1 year ago • 2 comments

I am trying to authenticate using a service principal's client ID and secret. I used the following:

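    from databricks import sql
    from databricks.sdk.core import Config, oauth_service_principal

    # server_hostname, http_path, databricks_client_id and
    # databricks_client_secret are defined elsewhere.

    def credential_provider():
        # Build the SDK config used for the OAuth machine-to-machine flow.
        config = Config(
            host=f"https://{server_hostname}",
            client_id=databricks_client_id,
            client_secret=databricks_client_secret)
        return oauth_service_principal(config)

    with sql.connect(server_hostname=server_hostname,
                     http_path=http_path,
                     credentials_provider=credential_provider) as connection:
        cur = connection.cursor()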

and received the following logs, with the error "Error during request to server". Are my authentication values the issue, or is it something else?

retry parameter: _retry_delay_min given_or_default 1.0
retry parameter: _retry_delay_max given_or_default 60.0
retry parameter: _retry_stop_after_attempts_count given_or_default 30
retry parameter: _retry_stop_after_attempts_duration given_or_default 900.0
retry parameter: _retry_delay_default given_or_default 5.0
Sending request: TOpenSessionReq(client_protocol=None, username=None, password=None, configuration={'spark.thriftserver.arrowBasedRowSet.timestampAsString': 'false'}, getInfos=None, client_protocol_i64=42247, connectionProperties=None, initialNamespace=None, canUseMultipleCatalogs=True, sessionId=None)
Error during request to server: {"method": "OpenSession", "session-id": null, "query-id": null, "http-code": null, "error-message": "", "original-exception": "invalid_client: Client authentication failed", "no-retry-reason": "non-retryable error", "bounded-retry-delay": null, "attempt": "1/30", "elapsed-seconds": "0.353135347366333/900.0"}
retry parameter: _retry_delay_min given_or_default 1.0
retry parameter: _retry_delay_max given_or_default 60.0
retry parameter: _retry_stop_after_attempts_count given_or_default 30
retry parameter: _retry_stop_after_attempts_duration given_or_default 900.0
retry parameter: _retry_delay_default given_or_default 5.0
Sending request: TOpenSessionReq(client_protocol=None, username=None, password=None, configuration={'spark.thriftserver.arrowBasedRowSet.timestampAsString': 'false'}, getInfos=None, client_protocol_i64=42247, connectionProperties=None, initialNamespace=None, canUseMultipleCatalogs=True, sessionId=None)
Error during request to server: {"method": "OpenSession", "session-id": null, "query-id": null, "http-code": null, "error-message": "", "original-exception": "invalid_client: Client authentication failed", "no-retry-reason": "non-retryable error", "bounded-retry-delay": null, "attempt": "1/30", "elapsed-seconds": "0.35685276985168457/900.0"}


Windows 10

chkp-orsaa, Feb 04 '24 12:02

Does the service principal itself work? If you take your config and do

    from databricks.sdk import WorkspaceClient

    w = WorkspaceClient(config=config)
    w.current_user.me()

does it succeed?

mgyucht, Feb 07 '24 07:02

I'm not sure if this is related or not, but it might help others that are facing a similar issue.

We started getting the same error (invalid_client: Client authentication failed) a few days ago. Nothing had changed in our code base or infrastructure, and it was working perfectly before.

For context, we're using Azure Databricks and connecting via a service principal. We've been using the client ID and secret from Microsoft Entra ID.

After digging into the stack trace a little, we confirmed the issue was coming from the call to get an access token, which, according to the docs, should look something like this:

    curl -X POST \
      <per-workspace-url>/oidc/v1/token \
      -d "grant_type=client_credentials" \
      -d "scope=all-apis" \
      -u "<service-principal-id>:<oauth-secret>"

We then sent this request ourselves, using the same credentials from Microsoft Entra ID, and confirmed that the error was coming from this call.
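For anyone who wants to reproduce this check without curl, here is a minimal Python sketch of the same token request using only the standard library. The function names `build_token_request` and `fetch_token` are made up for illustration. The endpoint path, form fields and Basic-auth layout are taken from the curl command above, and the response shape is not assumed beyond being JSON.

```python
import base64
import json
import urllib.parse
import urllib.request


def build_token_request(workspace_url, client_id, oauth_secret):
    """Build (but do not send) the client-credentials token request."""
    # Form-encoded body, matching the two -d flags of the curl call.
    body = urllib.parse.urlencode(
        {"grant_type": "client_credentials", "scope": "all-apis"}
    ).encode("ascii")
    # HTTP Basic auth, matching curl's -u "<id>:<secret>".
    basic = base64.b64encode(
        f"{client_id}:{oauth_secret}".encode("utf-8")
    ).decode("ascii")
    return urllib.request.Request(
        url=f"{workspace_url.rstrip('/')}/oidc/v1/token",
        data=body,
        headers={
            "Authorization": f"Basic {basic}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )


def fetch_token(workspace_url, client_id, oauth_secret):
    """Send the request and return the parsed JSON response.

    urllib raises urllib.error.HTTPError for a non-2xx status, which is
    where a rejected client ID/secret pair would be expected to surface.
    """
    request = build_token_request(workspace_url, client_id, oauth_secret)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

Calling `fetch_token("<per-workspace-url>", "<service-principal-id>", "<oauth-secret>")` with real values exercises only the token step, so a failure here points at the credentials rather than at the SQL connection itself.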

Reading the docs a bit more, we saw in "Step 6: Create an Azure Databricks OAuth secret for the service principal" that we should be using a Databricks OAuth secret (apparently something new, since we had only ever used the service principal credentials from Microsoft Entra ID). So we gave it a try: I created a new token, sent it in the same request as before, and it worked fine.
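To make the distinction concrete, the sketch below maps the two kinds of credentials onto differently named configuration fields. The helper `sdk_config_kwargs` and its `kind` labels are hypothetical. The field names (`client_id`/`client_secret` for a Databricks OAuth secret, `azure_client_id`/`azure_client_secret`/`azure_tenant_id` for Microsoft Entra ID credentials) are, as far as I know, the ones `Config` in the Databricks SDK for Python accepts, but please check them against your SDK version.

```python
def sdk_config_kwargs(host, kind, client_id, client_secret, tenant_id=None):
    """Return keyword arguments for databricks.sdk.core.Config.

    Hypothetical helper: `kind` is "databricks-oauth" for a secret created
    in Databricks, or "entra-id" for a Microsoft Entra ID client secret.
    """
    if kind == "databricks-oauth":
        # These are the fields the snippet in the original post fills in.
        return {
            "host": host,
            "client_id": client_id,
            "client_secret": client_secret,
        }
    if kind == "entra-id":
        if tenant_id is None:
            raise ValueError("tenant_id is required for Entra ID credentials")
        # Assumed field names for Entra ID service principal credentials.
        return {
            "host": host,
            "azure_client_id": client_id,
            "azure_client_secret": client_secret,
            "azure_tenant_id": tenant_id,
        }
    raise ValueError(f"unknown credential kind: {kind!r}")
```

The result would be passed as `Config(**sdk_config_kwargs(...))`. Since `oauth_service_principal` works from `client_id` and `client_secret`, putting an Entra ID secret into those two fields would plausibly produce the `invalid_client` error reported here.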

We believe this was introduced by this change; however, we're not sure whether it was an intended (and miscommunicated) breaking change or a regression, because it did work fine before 🤷🏼

CaioCavalcanti, Feb 08 '24 19:02