Client Secret Logged in Plaintext when Using AAD Authentication for Eventhub
Overwatch Version 0.7.2.1
Describe the bug: When using AAD-based authentication for Event Hubs, the value of the client secret is logged without redaction.
Hi williamdphillips,
Please use Azure Key Vault: configure all the secrets within the key vault and provide only the secret names in the Overwatch input configuration sheet.
Hi @Sandhya-ravindranath, we are using Azure KeyVault with a linked secret scope. We have provided the name of the secret in the config file, yet the resolved secret value is being logged.
Hi @williamdphillips,
Could you kindly share the configuration sheet with me? I would like to review its contents.
Hi @Sandhya-ravindranath, here are the values from the config. I've redacted many of them to avoid sharing unnecessary or private information. It seems that Overwatch is resolving the secret value for "clientSecret", logging it, and also storing it in the deployment report table. Any user with access to the logs or the table now has full access to the SPN.
| workspace_name | workspace_id | workspace_url | api_url | cloud | primordial_date | storage_prefix | etl_database_name | consumer_database_name | secret_scope | secret_key_dbpat | auditlogprefix_source_path | eh_name | eh_conn_string | aad_tenant_id | aad_client_id | aad_client_secret_key | aad_authority_endpoint | interactive_dbu_price | automated_dbu_price | sql_compute_dbu_price | jobs_light_dbu_price | max_days | excluded_scopes | active | proxy_host | proxy_port | proxy_user_name | proxy_password_scope | proxy_password_key | success_batch_size | error_batch_size | enable_unsafe_SSL | thread_pool_size | api_waiting_time | mount_mapping_path |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| REDACTED | REDACTED | REDACTED | REDACTED | Azure | 2022-05-20 | dbfs:/mnt/overwatch/dev_test | overwatch_etl | overwatch | dataeng-scope | AZUSE2CORTEXDATAENGNPDDBX1-PAT | cortexameuse2dataengnpd-overwatch-eh1 | Endpoint=sb://cortexameuse2dataengnpdeh-ns.servicebus.windows.net | REDACTED | REDACTED | clientSecret | https://login.microsoftonline.com/ | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | TRUE |
Hi @williamdphillips ,
I can see that the event hub connection string column (eh_conn_string) contains a direct value instead of a secret key name from the key vault, which is why it is printed during the run. Similarly, if you have provided direct values for aad_tenant_id and aad_client_id, those values will be displayed as well.
Hi @mohanbaabu1996, the issue here is the client secret, not any other value. Our secret name is clientSecret, as shown in the table I shared previously, but its resolved value is being logged in plain text and also written to tables in plain text.
Hey @williamdphillips , thank you for bringing this to our attention, we'll incorporate the masking into our next release.
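For readers following along, the kind of masking discussed here could look roughly like the sketch below. It is not Overwatch's implementation: the key names in `SENSITIVE_KEYS`, the `[REDACTED]` placeholder, and the helper names `redact` and `to_loggable_json` are all hypothetical, chosen only to illustrate masking by key name before a resolved config is logged or persisted.

```python
import json

# Hypothetical set of config keys whose resolved values must never be logged.
SENSITIVE_KEYS = {"clientSecret", "eh_conn_string", "proxy_password"}
MASK = "[REDACTED]"


def redact(value, sensitive_keys=SENSITIVE_KEYS):
    """Return a copy of a nested config with sensitive values replaced by MASK."""
    if isinstance(value, dict):
        return {
            key: MASK if key in sensitive_keys else redact(item, sensitive_keys)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [redact(item, sensitive_keys) for item in value]
    # Scalars that are not stored under a sensitive key pass through unchanged.
    return value


def to_loggable_json(config):
    """Serialize a redacted copy of the config; the original is left untouched."""
    return json.dumps(redact(config), sort_keys=True)
```

Because `redact` builds a new structure, the job can keep using the real secret in memory while only the masked copy reaches stdout or a report table.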
@gueniai Thank you - just FYI today I noticed this issue is no longer happening. Was a fix put out that didn't require a change on our side?
@williamdphillips well that's interesting. No, we haven't done any updates to encode the output of the secrets when writing to stdout. Did anything else change on the job? Databricks Runtime, Overwatch version or Spark version?
@gueniai No, nothing has changed on the job, cluster, etc. We did recently connect the workspace to UC (but are still using the Hive metastore until we move through the migration process).
Although the secrets are redacted from logs, I still do see them in the deployment report workspaceDetails column so if anything I think that could be rectified as part of this issue.
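As an illustration of how a value stored in a free-text column such as workspaceDetails might be scrubbed, here is a minimal sketch that masks by known secret value instead of by key name. The function name `scrub` and the placeholder string are assumptions for this example, and it presumes the caller already holds the resolved secret values in memory; it does not describe how Overwatch builds the deployment report.

```python
def scrub(text, secret_values, mask="[REDACTED]"):
    """Replace every occurrence of each known secret value in text with mask."""
    # Mask longer secrets first so a secret that contains another is replaced whole.
    for secret in sorted(set(secret_values), key=len, reverse=True):
        if secret:  # skip empty strings, which would otherwise match everywhere
            text = text.replace(secret, mask)
    return text
```

Value-based scrubbing catches a secret wherever it appears in the serialized string, at the cost of requiring the plain value at write time.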
Let's check whether this is still happening during the next milestone.