terraform-provider-databricks

[DOC] Improved documentation of services and listing arguments for the experimental exporter

Open oliverangelil opened this issue 1 year ago • 4 comments

Affected Resource(s)

https://registry.terraform.io/providers/databrickslabs/databricks/latest/docs/guides/experimental-exporter

Expected Details

I do not find the explanation of the services and listing arguments clear enough. After reading the descriptions of both I still have no idea how to use them. Both descriptions seem to imply that they are lists of services for importing. More detail with examples would be nice.

For example, what is the difference between

-services=groups,secrets,access,compute,users,jobs,storage
-listing=jobs,compute 

and

-services=compute,jobs
-listing=jobs,compute

and (all services taken for the services argument):

-listing=jobs,compute

oliverangelil avatar Mar 21 '24 18:03 oliverangelil

-listing is used to discover objects of the given service. -services filters which dependent services will be processed (all by default). I.e., -listing jobs will discover all jobs, and by default will process all subobjects, like, notebooks, SQL objects, owners, etc. But if you use -listing jobs together with -services notebooks,dlt,storage, then jobs will be exported only together with notebooks, DLT pipelines, and files on DBFS/UC files.

alexott avatar Mar 21 '24 18:03 alexott

Still confusing to me...

-listing jobs will discover all jobs, and by default will process all subobjects, like, notebooks, SQL objects, owners, etc

-listing jobs alone (with all -services as default) produces access.tf, dlt.tf, groups.tf, jobs.tf, repos.tf, users.tf files in my case. What do these other files (access, dlt, groups, repos) have to do with jobs? And why were these 5 .tf files created and not others?

-listing jobs with -services notebooks,dlt,storage produces jobs.tf & dlt.tf files. Again pretty random to me...

When I remove the -listing argument completely (considers all services by default) and only keep -services notebooks,dlt,storage the command takes much longer to run and produces a .tf file for ~15 services.

I cannot figure out the pattern that determines what .tf files will be produced. All seems pretty random.

oliverangelil avatar Mar 21 '24 19:03 oliverangelil

-listing is used to discover resources to export; if it's not specified, then all services are listed (if they have the List operation implemented). -services restricts the export of resources only to those resources whose service type is in the list specified by this option.
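
The defaults described above can be sketched as a small toy model. This is not the exporter's actual code: the service names, the `LISTABLE_SERVICES` subset, and the `resolve_options` helper are all hypothetical and only illustrate how an omitted -listing or -services value falls back to "everything".

```python
# Toy model of the defaults described above, not the exporter's real code.
# Hypothetical service names; the real exporter supports many more.
ALL_SERVICES = {"jobs", "compute", "notebooks", "storage", "access", "users", "groups"}
# Hypothetical subset of services that implement a List operation.
LISTABLE_SERVICES = {"jobs", "compute", "notebooks", "users", "groups"}


def parse_csv(value):
    """Split a comma-separated option value into a set of names."""
    return {item.strip() for item in value.split(",") if item.strip()}


def resolve_options(listing="", services=""):
    """Return (services to list, services allowed for export)."""
    # No -listing: every service that implements List is listed.
    listed = parse_csv(listing) or set(LISTABLE_SERVICES)
    # No -services: resources of every service may be exported.
    allowed = parse_csv(services) or set(ALL_SERVICES)
    return listed, allowed


if __name__ == "__main__":
    print(resolve_options(listing="jobs,compute", services="compute,jobs"))
    print(resolve_options(listing="jobs,compute"))
```

Under this model, the second and third commands in the original question list the same objects (jobs and compute) and differ only in which dependent resources are allowed into the export.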

For example, say you have a job comprising two notebooks and one SQL dashboard, and its tasks have Python libraries on DBFS attached. If you just use -listing jobs, then it will export the following resources:

  • job itself
  • two notebooks
  • directory where notebooks reside
  • libraries from DBFS
  • SQL dashboard, SQL queries that are used in it, and SQL warehouse that is used to run dashboard/queries
  • directory where SQL objects reside
  • permissions for all objects above
  • user/group information based on permissions, and directories

And it will create references/dependencies between these objects.

But if you also specify -services notebooks,storage, then it will export only:

  • job itself
  • two notebooks
  • directory where notebooks reside
  • libraries from DBFS

The rest, such as SQL object IDs, will be hardcoded instead of being exported as resources.
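
The job example above can be reproduced with a minimal simulation. It rests on assumptions that are not taken from the exporter's source: the resource names, the service label attached to each resource, and the rule that a filtered-out dependency is not traversed any further are all hypothetical. It only shows the pattern: objects found by listing are always exported, and each dependency is exported only if its service passes the -services filter.

```python
from collections import deque

# Hypothetical resource graph for the job in the example above:
# name -> (service label, names of resources it references).
# Service labels are assumptions made for illustration.
RESOURCES = {
    "job": ("jobs", ["notebook_a", "notebook_b", "dbfs_library",
                     "sql_dashboard", "job_permissions"]),
    "notebook_a": ("notebooks", ["notebook_dir"]),
    "notebook_b": ("notebooks", ["notebook_dir"]),
    "notebook_dir": ("notebooks", []),
    "dbfs_library": ("storage", []),
    "sql_dashboard": ("sql-dashboards", ["sql_query", "sql_warehouse", "sql_dir"]),
    "sql_query": ("sql-queries", ["sql_warehouse"]),
    "sql_warehouse": ("sql-endpoints", []),
    "sql_dir": ("notebooks", []),
    "job_permissions": ("access", ["owner_user"]),
    "owner_user": ("users", []),
}


def export(listing, services=None):
    """Return (exported, hardcoded) resource names.

    listing:  services whose objects are discovered as starting points.
    services: services allowed for dependencies; None means all.
    """
    roots = [name for name, (svc, _) in RESOURCES.items() if svc in listing]
    exported, hardcoded = set(roots), set()
    queue = deque(roots)
    while queue:
        name = queue.popleft()
        for dep in RESOURCES[name][1]:
            if dep in exported:
                continue
            if services is None or RESOURCES[dep][0] in services:
                exported.add(dep)    # becomes a resource with a reference
                queue.append(dep)    # and its own dependencies are followed
            else:
                hardcoded.add(dep)   # only its ID stays in the output
    return exported, hardcoded


if __name__ == "__main__":
    exported, hardcoded = export({"jobs"}, {"notebooks", "storage"})
    print(sorted(exported))
    print(sorted(hardcoded))
```

Calling `export({"jobs"})` with no filter returns every resource in the graph, which matches the first list above. With the `{"notebooks", "storage"}` filter only the job, the notebooks, their directory, and the DBFS library remain.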

alexott avatar Mar 22 '24 08:03 alexott

Thanks @alexott this was very useful. I came here searching for help because I was also confused by the differences between listing and services from the current documentation. If we could add your example there to the docs, I think it'll help a lot of people in the future. Thanks

scottclay avatar May 05 '24 11:05 scottclay