[SPARK-45789][SQL] Support DESCRIBE TABLE for clustering columns

Open imback82 opened this issue 1 year ago • 1 comments

What changes were proposed in this pull request?

This PR proposes to add clustering column info as the output of DESCRIBE TABLE.

Why are the changes needed?

Currently, it's not easy to retrieve clustering column info; you have to go through the catalog APIs.

Does this PR introduce any user-facing change?

Yes. When you run DESCRIBE TABLE on a clustered table, the output now includes a "# Clustering Information" section:

CREATE TABLE tbl (col1 STRING, col2 INT) using parquet CLUSTER BY (col1, col2);
DESC tbl;

+------------------------+---------+-------+
|col_name                |data_type|comment|
+------------------------+---------+-------+
|col1                    |string   |NULL   |
|col2                    |int      |NULL   |
|# Clustering Information|         |       |
|# col_name              |data_type|comment|
|col1                    |string   |NULL   |
|col2                    |int      |NULL   |
+------------------------+---------+-------+
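For callers that want the clustering columns programmatically, the section above can be picked out of the DESCRIBE output by scanning for its marker row. The following is a minimal sketch, not part of this PR: the helper name `clustering_columns` is hypothetical, and it assumes the rows have the three-column shape shown above, with the clustering section running until the next `#` section header or blank row.

```python
from typing import List, Optional, Tuple

# One DESCRIBE TABLE output row: (col_name, data_type, comment).
Row = Tuple[str, str, Optional[str]]

SECTION_MARKER = "# Clustering Information"
SUBHEADER = "# col_name"


def clustering_columns(rows: List[Row]) -> List[str]:
    """Return the column names listed under the clustering section.

    Hypothetical helper: assumes the section ends at the next '#' header
    or blank row, which this PR's description does not spell out.
    """
    names: List[str] = []
    in_section = False
    for col_name, _data_type, _comment in rows:
        if not in_section:
            # Skip everything until the section marker row appears.
            in_section = col_name == SECTION_MARKER
            continue
        if col_name == SUBHEADER:
            # Repeated header row inside the section; not a column.
            continue
        if col_name == "" or col_name.startswith("#"):
            # Blank row or another section header ends the section.
            break
        names.append(col_name)
    return names


# Rows copied from the example output above (NULL shown as None).
rows: List[Row] = [
    ("col1", "string", None),
    ("col2", "int", None),
    ("# Clustering Information", "", ""),
    ("# col_name", "data_type", "comment"),
    ("col1", "string", None),
    ("col2", "int", None),
]

print(clustering_columns(rows))  # ['col1', 'col2']
```

With a live session, the rows would presumably come from something like `spark.sql("DESC tbl").collect()`; PySpark `Row` objects are tuples, so the same unpacking should apply.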

How was this patch tested?

Added new unit tests.

Was this patch authored or co-authored using generative AI tooling?

No

imback82, Feb 10 '24 00:02

cc @cloud-fan

imback82, Feb 10 '24 00:02

thanks, merging to master!

cloud-fan, Feb 20 '24 06:02