ucx
[FEATURE]: Migrate direct filesystem access to UC tables
We have instances of `spark.read.format("delta").load("s3a://prefix/...")`
in the code that we want to migrate to `spark.table("catalog.schema.table")`
to follow UC practices. This builds on top of "tables in mounts". See:
- https://github.com/databrickslabs/ucx/issues/1225
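
The rewrite described above could be sketched as a lookup from a storage location to a UC table name. This is only an illustration under stated assumptions: the `LOCATION_TO_TABLE` mapping, its bucket and table names, and the `suggest_table_call` helper are all hypothetical, and where the real mapping lives is still open (see below).

```python
from typing import Dict, Optional

# Hypothetical mapping from a table's storage location to its UC name.
# Both entries are made up for illustration.
LOCATION_TO_TABLE: Dict[str, str] = {
    "s3a://example-bucket/sales/orders": "main.sales.orders",
    "s3a://example-bucket/sales/customers": "main.sales.customers",
}


def suggest_table_call(location: str, mapping: Dict[str, str]) -> Optional[str]:
    """Return the spark.table(...) call that could replace a direct load of `location`."""
    # Ignore a trailing slash so both spellings of a location match.
    table = mapping.get(location.rstrip("/"))
    if table is None:
        return None  # no known UC table at this location; leave the code as-is
    return f'spark.table("{table}")'
```

A location without a known table returns `None`, so the caller can report the access without rewriting it.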
Do we migrate to UC Volumes?
yes
Do we resolve mounts?
yes
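
A rough sketch of what resolving a mount might look like, assuming the mount points and their cloud sources are already known as a plain dictionary. The `resolve_mount` helper and the example bucket are hypothetical and not part of UCX.

```python
from typing import Dict, Optional


def resolve_mount(path: str, mounts: Dict[str, str]) -> Optional[str]:
    """Translate a path under a mount point into its cloud storage URL."""
    # Normalize "dbfs:/mnt/..." and "/dbfs/mnt/..." to a plain "/mnt/..." form.
    for prefix in ("dbfs:", "/dbfs"):
        if path.startswith(prefix + "/"):
            path = path[len(prefix):]
            break
    # Try the longest mount point first so nested mounts resolve correctly.
    for mount_point in sorted(mounts, key=len, reverse=True):
        root = mount_point.rstrip("/")
        if path == root or path.startswith(root + "/"):
            return mounts[mount_point].rstrip("/") + path[len(root):]
    return None  # not under any known mount


# Hypothetical mount table, for illustration only.
mounts = {"/mnt/raw": "s3a://example-bucket/raw"}
resolved = resolve_mount("dbfs:/mnt/raw/orders", mounts)
```

Matching on `root + "/"` rather than on the bare prefix keeps `/mnt/rawdata` from being treated as part of the `/mnt/raw` mount.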
Do we resolve `dbutils.widgets.get()`?
if possible
Where do we store mappings? Add a prefix in the table mapping?
TBD
What scans all jobs?
- assessment workflow
- `migration-progress` (new) workflow on a daily schedule
See:
- https://github.com/databrickslabs/ucx/issues/2074
What determines all direct filesystem accesses?
- Extend `FromDbfsFolder`, `DirectFilesystemAccessMatcher`, and `FromTable` to return file access.
- Add new matchers for `open('/dbfs/...')` literals.
- Modify `WorkflowLinter` to persist this information in a new table.
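
As a minimal sketch of the `open('/dbfs/...')` matcher idea, the following uses Python's standard `ast` module to collect string literals passed to `open()` that start with `/dbfs/`. The `DbfsOpenFinder` class and `find_dbfs_opens` function are hypothetical names; this does not reflect the existing matcher interface in UCX, and it only handles plain literals (not f-strings, variables, or widget values).

```python
import ast
from typing import List, Tuple


class DbfsOpenFinder(ast.NodeVisitor):
    """Collect string literals passed to open() that point under /dbfs/."""

    def __init__(self) -> None:
        self.findings: List[Tuple[int, str]] = []

    def visit_Call(self, node: ast.Call) -> None:
        # Only match direct calls to the builtin name `open`.
        is_open = isinstance(node.func, ast.Name) and node.func.id == "open"
        if is_open and node.args:
            first = node.args[0]
            # Only plain string literals are considered here.
            if isinstance(first, ast.Constant) and isinstance(first.value, str):
                if first.value.startswith("/dbfs/"):
                    self.findings.append((node.lineno, first.value))
        # Keep walking so nested calls are also inspected.
        self.generic_visit(node)


def find_dbfs_opens(source: str) -> List[Tuple[int, str]]:
    """Return (line number, path) pairs for open('/dbfs/...') literals in `source`."""
    finder = DbfsOpenFinder()
    finder.visit(ast.parse(source))
    return finder.findings
```

Each finding carries the line number and the literal path, which is roughly the information a linter would need to persist for later migration.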