
Databricks Asset Bundles - Firewall restrictions when deploying bundles - Terraform Provider

Open · oumeima-elgharbi opened this issue 1 year ago · 1 comment

Hello,

I am trying to deploy Asset Bundles using the Databricks CLI (v0.208.3, installed inside a Docker container along with Terraform v1.7.4; I also tried Terraform v1.5.5).

The problem is that I am running the Databricks CLI inside my company's network, which has network/firewall restrictions. When I ran the command "databricks bundle deploy" I got this first error:

$ databricks bundle deploy
Error: error downloading Terraform: Get "https://releases.hashicorp.com/terraform/1.5.5/index.json": EOF

I fixed that error by updating the databricks.yml file for the bundle: instead of downloading Terraform v1.5.5, the CLI now uses the Terraform binary installed in the Docker image (following this issue: https://github.com/databricks/cli/issues/782):

bundle:
  terraform:
    exec_path: PATH_TO_TERRAFORM_CLI
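exec_path can point at any Terraform binary available inside the container. As a minimal, illustrative sanity check (assuming the binary is meant to be found on PATH, as in the workaround from issue #782), one might verify that terraform actually resolves before attempting a deploy:

```shell
#!/bin/sh
# Illustrative sanity check: resolve "terraform" from PATH, the same lookup
# an exec_path of plain "terraform" would rely on, and report the result.
TF_PATH="$(command -v terraform || true)"
if [ -n "$TF_PATH" ]; then
  echo "terraform resolved to $TF_PATH"
else
  echo "terraform is not on PATH; install it in the image first"
fi
```

Running this as part of the image build catches a missing or non-executable binary early, instead of discovering it mid-deploy.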

But now I get another error, about the Terraform provider:

$ databricks bundle deploy
Starting upload of bundle files
Uploaded bundle files at /Users/[MASKED]/.bundle/test_default_python/datalab/files!
Starting resource deployment
Error: terraform init: exit status 1
Error: Failed to query available provider packages
Could not retrieve the list of available versions for provider
databricks/databricks: could not connect to registry.terraform.io: failed to
request discovery document: Get
"https://registry.terraform.io/.well-known/terraform.json": EOF

The CLI makes GET requests, but inside the Docker container the firewall blocks access to these URLs, and they are not reachable even through a proxy.

Is there a way to avoid the GET request for the Terraform provider, or to stop the Databricks CLI from making GET requests altogether?

Thanks.

oumeima-elgharbi avatar Feb 26 '24 16:02 oumeima-elgharbi

For the record, I got this working for a CI/CD release (our build agents don't have unrestricted internet access). Here's what I had to do:

  1. Install terraform (1.5.5 in this case) on the build agent and add it to the system path. (In your case, replace "build agent" with "container image" in these instructions.)
  2. Set the terraform executable path in the bundle file to "terraform" as noted in issue #782 (this picks it up from the system path and means it doesn't have to be hardcoded, so it works locally as well as on the build agent):
bundle:
  terraform:
    exec_path: terraform
  3. Download the Databricks Terraform provider and get it onto the build agent; in our case we just put it in source control in our databricks repository. (You could also put it in a known location on the build agent.) You can grab the cached provider(s) from the .databricks/bundle/stage/terraform/.terraform folder - just copy the whole "providers" folder.
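A rough sketch of that copy step (the "stage" target name and destination are illustrative; on a real machine the cache is produced by a prior "databricks bundle deploy" from a network-connected host, so the mkdir below only simulates it to keep the sketch self-contained):

```shell
#!/bin/sh
set -eu

# "stage" is a placeholder for your bundle target; after one deploy from a
# machine with network access, the provider cache lives here:
SRC=".databricks/bundle/stage/terraform/.terraform/providers"

# Simulated cache tree so this sketch runs standalone; on a real machine
# this directory already exists and contains the provider binaries.
mkdir -p "$SRC/registry.terraform.io/databricks/databricks"

# Copy the whole "providers" folder into the repo (or a known agent path).
# The registry.terraform.io/... layout inside it is exactly what a
# filesystem mirror expects, so preserve it as-is.
cp -R "$SRC" ./providers
```

Keeping the internal registry.terraform.io/databricks/databricks/... directory structure intact is what lets Terraform's filesystem_mirror find the provider later.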

In the build, we added the following steps:

  1. Create a terraform.rc file that sets up a filesystem mirror for the Databricks terraform provider:
provider_installation {
  filesystem_mirror {
    path = "/path/to/providers_folder"
    include = ["*/*/*"]
  }
}
  2. Set the TF_CLI_CONFIG_FILE environment variable to the path of the above terraform.rc file. If you are running this process interactively, you can instead put the above in your ~/.terraformrc file.
  3. Manually write the .terraform.lock.hcl file to force Terraform to use the local provider; this prevents it from reaching out to registry.terraform.io. You can grab the generated .terraform.lock.hcl file from the bundle's .databricks/bundle/stage/terraform/ folder after deploying it once, or you can take the hash of the version it's trying to download and create the file by hand. It needs to exist in the .databricks/bundle/stage/terraform/ folder (create the folder if it doesn't exist) before calling "databricks bundle deploy". An example of the one we create (for Databricks CLI version 0.214.1) is below.
provider "registry.terraform.io/databricks/databricks" {
  version     = "1.37.0"
  constraints = "1.37.0"
  hashes = [
    "h1:MM/wGk5KCWl/6IVxyRfiQAVwiUIQll73a5X87zT7N/Q=",
  ]
}
  4. You can then run "databricks bundle deploy" without either databricks or terraform requiring access to any public internet URLs.
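Put together, the build steps above can be sketched as a shell script. All paths, the "stage" target name, and the pinned provider version/hash are illustrative and taken from the example above; adapt them to your own setup:

```shell
#!/bin/sh
set -eu

# Illustrative paths; adapt to your agent/container layout.
MIRROR_DIR="$PWD/providers_folder"
BUNDLE_TF_DIR=".databricks/bundle/stage/terraform"

# 1. Point Terraform at a CLI config file declaring the filesystem mirror.
#    (Unquoted EOF so $MIRROR_DIR is expanded into the file.)
cat > terraform.rc <<EOF
provider_installation {
  filesystem_mirror {
    path    = "$MIRROR_DIR"
    include = ["*/*/*"]
  }
}
EOF
export TF_CLI_CONFIG_FILE="$PWD/terraform.rc"

# 2. Pre-create the bundle's terraform folder and write the dependency
#    lock file so "terraform init" never contacts registry.terraform.io.
#    (Quoted 'EOF' so nothing inside is expanded.)
mkdir -p "$BUNDLE_TF_DIR"
cat > "$BUNDLE_TF_DIR/.terraform.lock.hcl" <<'EOF'
provider "registry.terraform.io/databricks/databricks" {
  version     = "1.37.0"
  constraints = "1.37.0"
  hashes = [
    "h1:MM/wGk5KCWl/6IVxyRfiQAVwiUIQll73a5X87zT7N/Q=",
  ]
}
EOF

# 3. Deploy without any outbound calls from terraform.
# databricks bundle deploy   # (commented out: requires the Databricks CLI)
```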

This is obviously not suitable as a long-term solution: every time there is a Databricks CLI or Terraform provider update, we would need to update the CLI, update the provider, and update the code that generates the .terraform.lock.hcl file; and if the way the Databricks CLI invokes terraform changes, we would have to reverse engineer and update that too.

Databricks needs to support network-isolated use cases. Ideally, the databricks CLI would use the Terraform provider's code directly (embedding it rather than shelling out to separate terraform and provider executables), so we wouldn't have to keep three different executables updated, generate local lock files, maintain filesystem mirrors, and so on.

ericmeans3cloud avatar Mar 22 '24 14:03 ericmeans3cloud

We now provide an official Docker image packaged with everything needed, so no network calls are made: https://github.com/databricks/cli/pkgs/container/cli

andrewnester avatar Jul 29 '24 13:07 andrewnester

Perhaps this could still be reconsidered. Docker images are fine, and yes, you can easily create a private registry and push the image there. But that's yet another cost and maintenance burden. I would still love to see a better workaround (or a permanent solution) than the one @ericmeans3cloud discovered.

Gijsreyn avatar Feb 20 '25 10:02 Gijsreyn

@Gijsreyn It is indeed a top priority on our internal roadmap. We are actively looking into solutions and prototypes to remove the dependency on the Terraform executable altogether. I can't comment on a timeline just yet, though.

shreyas-goenka avatar Feb 20 '25 11:02 shreyas-goenka