soda-core
Docker container local configuration
Please review these assumptions first:
- You have built a Docker container to run a scan.
- In production, the container uses a BigQuery service account to run the scan.
- Developers want to run that container locally to test scans.
- When they run it locally, developers want to be able to use a different, non-service account.
With those assumptions I would propose the following approach:
The scan uses 1 or more configuration files as input, next to the SodaCL check files.
The configuration files contain the connection details including GCP account credentials.
So the goal here is to use different configuration files in production than locally on the developers' laptops.
The scan configuration files can be referenced on the command line with the -c option, e.g.:
soda scan -d bq -c configuration-bg-prod.yml checks.yml
and
soda scan -d bq -c configuration-bg-dev.yml checks.yml
See also https://docs.soda.io/soda-core/scan-core.html#anatomy-of-a-scan-command
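One way to wire this up is a small wrapper that picks the configuration file from an environment name. This is only a minimal sketch: the select_config helper and the SODA_ENV variable are hypothetical names of mine, not Soda features, and the file names are the ones from the two commands above.

```shell
# Hypothetical helper: map an environment name to a configuration file.
# Defaults to "dev" when no name is given, so a laptop never picks up the
# production (service account) configuration by accident.
select_config() {
  case "${1:-dev}" in
    prod) echo "configuration-bg-prod.yml" ;;
    dev)  echo "configuration-bg-dev.yml" ;;
    *)    echo "unknown environment: $1" >&2; return 1 ;;
  esac
}

# Usage (not executed here; SODA_ENV is an assumed variable name):
#   soda scan -d bq -c "$(select_config "${SODA_ENV:-dev}")" checks.yml
```

In the production container you would set SODA_ENV=prod; locally you would leave it unset.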
Does this help to find a solution for
- Configuring the local developer account credentials
- Avoiding the use of the service account when it's not wanted?
SODA-1129
Hi 👋🏼
Not sure if it covers your DTAP needs, but I currently use docker run -e KEY=VALUE to pass variables from the host to the container, where VALUE depends on which DTAP environment it should run in. The provided configuration.yml is written with variables, so it can read the provided VALUEs as system environment variables. The result is one image/container that is suitable for multiple environments and configurable at run time. I use it to run Soda's Docker image in 4 different environments with Azure Pipelines or whatever CI/CD tool is applicable, but it should be perfectly suitable for ad hoc use from the CLI.
configuration.yml:
data_source your_data_source:
  type: sqlserver
  connection:
    host: ${SQL_SERVER}
    username: ${SQL_USERNAME}
    password: ${SQL_PASSWORD}
    database: ${SQL_DATABASE}
    schema: ${SQL_SCHEMA}
    trusted_connection: false
    encrypt: true
    trust_server_certificate: false
soda_cloud:
  host: cloud.soda.io
  api_key_id: ${SODA_API_ID}
  api_key_secret: ${SODA_API_SECRET}
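Since every value in this file comes from the environment, it can help to check that the variables are actually set before starting a scan. A minimal sketch, assuming a POSIX shell; missing_vars is a hypothetical helper of mine, not part of Soda or Docker:

```shell
# Hypothetical pre-flight check: print the names of the given variables that
# are unset or empty, and return non-zero if there are any.
# The body runs in a subshell ( ... ) so its working variables do not leak.
missing_vars() (
  missing=""
  for name in "$@"; do
    # Look up the value of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      missing="${missing:+$missing }$name"
    fi
  done
  echo "$missing"
  [ -z "$missing" ]
)

# Usage (not executed here):
#   missing_vars SQL_SERVER SQL_USERNAME SQL_PASSWORD SQL_DATABASE SQL_SCHEMA \
#     SODA_API_ID SODA_API_SECRET || echo "set these first" >&2
```

The variable names passed in are the ones referenced in the configuration.yml above.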
Command to provide values for the variables being called from configuration.yml (using PowerShell Core-syntax in Azure Pipelines, but you get the point):
docker run `
  --rm `
  -v /path/to/your_soda_directory:/sodacl `
  -e SQL_SERVER=127.0.0.1 `
  -e SQL_USERNAME=sa `
  -e SQL_PASSWORD=****** `
  -e SQL_DATABASE=master `
  -e SQL_SCHEMA=dbo `
  -e SODA_API_ID=ab12345a-1a12-123a-12ab-a12aa1ab1234 `
  -e SODA_API_SECRET=****** `
  sodadata/soda-core:v3.0.10 scan -d your_data_source -c /sodacl/configuration.yml /sodacl/checks.yml
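For ad hoc local use, the same values could also live in one env file per environment and be passed with Docker's --env-file option instead of repeating -e. A minimal sketch in POSIX shell: run_soda_scan and the DOCKER override are hypothetical names of mine, and the env file (for example dev.env, kept out of version control) is assumed to hold lines such as SQL_SERVER=127.0.0.1.

```shell
# Hypothetical wrapper around the docker run command above.
#   $1 = env file with KEY=VALUE lines, $2 = host directory holding the Soda files.
# Setting DOCKER=echo prints the arguments instead of starting a container.
run_soda_scan() {
  "${DOCKER:-docker}" run --rm \
    -v "$2:/sodacl" \
    --env-file "$1" \
    sodadata/soda-core:v3.0.10 scan -d your_data_source \
    -c /sodacl/configuration.yml /sodacl/checks.yml
}

# Usage (not executed here):
#   run_soda_scan dev.env /path/to/your_soda_directory
```

That way, switching accounts locally means switching the env file, while the image and configuration.yml stay the same.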
Hope this helps. Cheers!