TabularEditor3 icon indicating copy to clipboard operation
TabularEditor3 copied to clipboard

[Issue] Column comments from Databricks not imported

Open hayescode opened this issue 1 year ago • 8 comments

Describe the issue Using the databricks wizard, not all column comments from the source tables are added to the column description. Updating table schema sometimes pulls these in, other times it doesn't. I haven't found a workaround.

** Which version are you currently using? ** Tabular Editor 3, version: 3.7.1

To Reproduce Import a new databricks table using the wizard and entering credentials.

Expected behavior Column comments (and Table comment) are added to the description field in TE.

Screenshots If applicable, add screenshots to help explain your problem.

Desktop (please complete the following information):

  • Windows version: [10

Additional context Add any other context about the problem here.

hayescode avatar Jun 13 '23 20:06 hayescode

Hi @hayescode You mentioned that not all comments are imported. I'm curious to know why some columns work while others don't. Is there any pattern that distinguishes comments that are imported vs. those that aren't? Maybe line breaks, or special characters in the comment string? Perhaps you can provide a sample of a table where only some of the column comments are imported, or show a few screenshots? Thanks.

otykier avatar Jun 16 '23 07:06 otykier

Hi @otykier I'm on v3.8 now. I just imported a new table but no comments came through at all. I Update Table Schema but it says it's "Up to date". I tried modifying the table schema comments but to no avail.

Here's the table schema I'm using:

CREATE OR REPLACE TABLE BI.BI_FACT_FIN_SCIC_DASHBOARD (
    MONTH DATE COMMENT 'The month of the data in date format(5/22/2023)',
    PRODUCT_TYPE STRING COMMENT 'The product category',
    BUDGET_BARRELS DECIMAL(38,6) COMMENT 'The amount of budgeted barrels',
    ACTUAL_BARRELS DECIMAL(38,6) COMMENT 'The amount of actual barrels',
    DEST_MIX DECIMAL(38,6) COMMENT 'The dollars of destination mix',
    SKU_MIX DECIMAL(38,6) COMMENT 'The dollars of SKU (PDCN) mix',
    BUDGET_DPB_FREIGHT DECIMAL(38,6) COMMENT 'The budgeted dollar per barrel (DPB) for freight',
    LE_DPB_FREIGHT DECIMAL(38,6) COMMENT 'The latest estimate of the dollar per barrel (DPB) for freight',
    ACTUAL_DPB_FREIGHT DECIMAL(38,6) COMMENT 'The actual dollar per barrel (DPB) for freight',
    ACTUAL_DPB_BARRELS DECIMAL(38,6) COMMENT 'The actual number of barrels shipped for dollar per barrel (DPB)',
    BUDGET_DPB_IMPACT DECIMAL(38,6) COMMENT 'The impact of the budgeted dollar per barrel (DPB)',
    LE_DPB_IMPACT DECIMAL(38,6) COMMENT 'The impact of the latest estimate of the dollar per barrel (DPB)',
    BUDGET_RPM_FREIGHT DECIMAL(38,6) COMMENT 'The budgeted freight for Rate Per Mile (RPM)',
    BUDGET_RPM_MILES DECIMAL(38,6) COMMENT 'The budgeted miles for Rate Per Mile (RPM)',
    LE_RPM_FREIGHT DECIMAL(38,6) COMMENT 'The latest estimate freight for Rate Per Mile (RPM)',
    LE_RPM_MILES DECIMAL(38,6) COMMENT 'The budgeted miles for Rate Per Mile (RPM)',
    ACTUAL_RPM_FREIGHT DECIMAL(38,6) COMMENT 'The actual freight for Rate Per Mile (RPM)',
    ACTUAL_RPM_MILES DECIMAL(38,6) COMMENT 'The budgeted miles for Rate Per Mile (RPM)',
    BUDGET_RPM_IMPACT DECIMAL(38,6) COMMENT 'The impact of the budgeted Rate Per Mile (RPM)',
    LE_RPM_IMPACT DECIMAL(38,6) COMMENT 'The impact of the latest estimate Rate Per Mile (RPM)',
    BUDGET_MPB_MILES DECIMAL(38,6) COMMENT 'The budgeted miles for ABINC miles per barrel (MPB)',
    LE_MPB_MILES DECIMAL(38,6) COMMENT 'The latest estimate miles for ABINC miles per barrel (MPB)',
    ACTUAL_MPB_MILES DECIMAL(38,6) COMMENT 'The actual miles for ABINC miles per barrel (MPB)',
    ACTUAL_MPB_BARRELS DECIMAL(38,6) COMMENT 'The actual barrels for ABINC miles per barrel (MPB)',
    BUDGET_MPB_IMPACT DECIMAL(38,6) COMMENT 'The impact of the budgeted ABINC miles per barrel (MPB)',
    LE_MPB_IMPACT DECIMAL(38,6) COMMENT 'The impact of the latest estimate ABINC miles per barrel (MPB)', 
    DBX_LAST_MODIFIED TIMESTAMP NOT NULL COMMENT 'Databricks last modified this record'
)

hayescode avatar Jun 30 '23 18:06 hayescode

Thank you @hayescode

We'll investigate and get back to you.

DBojsen avatar Jul 18 '23 11:07 DBojsen

@DBojsen I am on version 3.11 now and still sometimes column comments don't come over. It's only sometimes. Other times they do come over and I'm not doing anything different.

Also I've never had the table comments imported so maybe that feature just doesn't exist? It would be good to have that but that's low priority tbh.

hayescode avatar Oct 20 '23 12:10 hayescode

@hayescode Good to know that it does work most of the time. If you are able to identify the conditions for this behaviour we would be interested in your findings, as it would make it a lot easer to solve :)

mlonsk avatar Oct 23 '23 09:10 mlonsk

@otykier @mlonsk I'm on 3.15 now and I have not seen the columns come through at all since my last post.

Here's an example DDL and what I'm seeing. This example starts with a brand new TE model.

CREATE OR REPLACE TABLE LOGISTICS_DEV.BI.BI_DIM_COMMODITY (
  `Commodity` STRING COMMENT 'The type of commodity of a product or load.',
  `Commodity Type` STRING COMMENT 'The category/bucket for the commodity',
  DBX_LAST_MODIFIED TIMESTAMP NOT NULL COMMENT 'Databricks last modified this record',
  CONSTRAINT bi_dim_commodity_pk PRIMARY KEY (`Commodity`)
) COMMENT 'A Dimension table that stores Commodity codes/descriptions for shipments and products'
TBLPROPERTIES (
  delta.autoOptimize.optimizeWrite = true,
  delta.autoOptimize.autoCompact = true,
  'delta.columnMapping.mode' = 'name'
);

image

System:

  • Windows 11
  • TE 3.15
  • Azure Databricks + Unity Catalog

Ideally new/update table schema would bring in and configure:

  • Column Comments
  • Table Comments
  • Mark Primary Keys (although composite PK is tricky)

hayescode avatar May 17 '24 11:05 hayescode

Hi @hayescode Thank you for the additional information. I can definitely see how this is a useful thing to have when using the Databricks connector. I actually do not think pulling in the comments as descriptions was part of the original intention of the Databricks import wizard support.

mlonsk avatar May 28 '24 12:05 mlonsk

@mlonsk ah interesting. This would be a nice feature request then ;)

hayescode avatar May 28 '24 13:05 hayescode