[Feature Request] DABs: allow target specific includes
I am currently working on a project in the pharmaceutical sector within a GxP-validated environment. Operating under GxP (Good Practice) standards means we must exercise rigorous control over everything that is deployed across our environments (development, staging, and production), each of which has its own dedicated workspace. This heightened scrutiny is essential, as GxP compliance is not optional but a legal and regulatory requirement designed to ensure the safety, quality, and integrity of pharmaceutical products at every stage of their lifecycle.
Every deployment must meet strict validation and documentation standards, as even minor deviations can have significant regulatory and patient safety implications. This approach safeguards not only our compliance with global authorities such as the FDA and EMA, but also the trust of patients and healthcare providers who rely on the quality and safety of our products.
So, we are looking for a feature like:
A - Flexibility to have workspace-specific includes:
targets:
  dev_01:
    include:
      - workflows/dev_01/*.yml
B - Ability to interpolate a directory in the includes, something like:
include:
  - workflows/{bundle.target}/*.yml
Currently, our workaround involves creating multiple databricks.yml files and renaming them during CD, depending on the target environment. While this approach allows us to move forward, it is ultimately a stopgap rather than a robust solution.
In the pharmaceutical industry, GxP validation is not just a standard process; it is a critical requirement that ensures compliance, data integrity, and patient safety across the sector. This need for flexible environment configuration is not unique to our team; it is a recurring challenge across various use cases in the industry. Implementing a more streamlined and scalable solution would deliver significant value, not only to us but to the entire pharmaceutical sector, by enhancing efficiency, reducing risk, and supporting industry-wide best practices.
We also require a similar feature in the Kaiser environment.
@Escoto if you define resources under a specific target, they are deployed only when that target is deployed and not for the others, which in practice achieves exactly what you want. The general suggestion is:
- If you have resources that should be deployed for all targets, define them as top-level resources, for example:
  resources:
    jobs:
      my_job: ...
- If you have resources that need to be deployed only for a specific target, define them under that target's section:
  targets:
    dev:
      resources:
        jobs:
          my_job_specific: ...
You can always verify which resources will be deployed by running databricks bundle validate -t <target> and databricks bundle summary -t <target> to avoid any potential drift.
Is there any reason why this pattern would not work for your use case?
Here's an example of how the requested behaviour can be achieved with target overrides today. Please take a look https://github.com/databricks/bundle-examples/pull/96
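As a rough illustration of the target override pattern described above, here is a minimal sketch of how it could be split across files so that each team owns its own directory. It assumes that files pulled in through the top-level include are merged into the root configuration and may carry their own targets section; this should be confirmed with databricks bundle validate -t <target> before relying on it. All bundle, file, job, and host names are hypothetical and are not taken from the linked PR.

```yaml
# databricks.yml (root file; every name below is a hypothetical placeholder)
bundle:
  name: example_bundle

include:
  # Every matching file is loaded for every target;
  # the scoping happens inside each file, not in this glob.
  - workflows/*/*.yml

targets:
  dev_01:
    default: true
    workspace:
      host: https://dev.example.cloud.databricks.com   # placeholder host
  prod:
    workspace:
      host: https://prod.example.cloud.databricks.com  # placeholder host

---
# workflows/dev_01/jobs.yml (assumed to be owned by the dev team)
# Resources nested under a target are deployed only for that target.
targets:
  dev_01:
    resources:
      jobs:
        dev_only_job:
          name: dev_only_job

---
# workflows/shared/jobs.yml (top-level resources, deployed for every target)
resources:
  jobs:
    shared_job:
      name: shared_job
```

With a layout like this, reviewing what a given environment receives would mean reading its directory plus the shared one, although the files are still loaded for all targets rather than being included conditionally.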
@andrewnester This is quite an inconvenient approach. Imagine a situation where you want to release something gradually across environments, and different people are responsible for (or approve) the configuration in each of those environments (e.g. the dev team decides what to deploy in the DEV environment, and the release team decides what to enable in the PREPROD and PROD environments). Using targets I can easily split the responsibilities into separate files. However, with the proposed approach it is not clear what is enabled in DEV, PREPROD and PROD unless you go through every file (or run databricks bundle validate). So per-target includes would definitely be helpful.
P.S. I tried another approach which also does not work.
target:
  prod:
    default: false
    workspace: <Omitted>
    variables: <Omitted>
    # Excluded job resources from the environment. These will be enabled as soon as properly tested.
    resources:
      jobs:
        job_under_development_1: {}
        job_under_development_2: {}