
Not all tables from system.metadata are access control aware

Open · huberty89 opened this issue · 2 comments

I noticed that some tables in system.metadata do not respect the rules from file-based access control. My rules file is:

{
    "catalogs": [],
    "tables": [{
        "catalog": "system",
        "schema": "metadata",
        "privileges": [ "SELECT" ]
    }]
}

With these rules in place I should not see any catalogs besides system, yet a few tables in system.metadata reveal which catalogs are configured.
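
For illustration, a hypothetical probe against one of the other configured catalogs (the hive catalog from the output below; the table name here is made up) should be rejected under these rules:

-- No catalog or table rule matches hive, so file-based access control
-- should deny this query outright (Access Denied), which is the expected behavior.
SELECT * FROM hive.default.example_table;
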

Checklist of system.metadata tables and whether they are access control aware:

- analyze_properties
- catalogs
- column_properties
- materialized_view_properties
- materialized_views
- schema_properties
- table_comments
- table_properties

trino> SHOW CATALOGS;
 Catalog
---------
 system
(1 row)
trino> SELECT * FROM system.metadata.analyze_properties;
 catalog_name |    property_name     | default_value |            type             |                         description
--------------+----------------------+---------------+-----------------------------+-------------------------------------------------------------
 delta        | columns              |               | array(varchar)              | Columns to be analyzed
 delta        | files_modified_after |               | timestamp(3) with time zone | Take into account only files modified after given timestamp
 hive         | columns              |               | array(varchar)              | Columns to be analyzed
 hive         | partitions           |               | array(array(varchar))       | Partitions to be analyzed
(4 rows)
trino> SELECT * FROM system.metadata.column_properties;
 catalog_name |           property_name            | default_value |      type      |                                                                                            description                                                                     >
--------------+------------------------------------+---------------+----------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------->
 hive         | partition_projection_digits        |               | integer        | Number of digits to be used with integer column projection                                                                                                                 >
 hive         | partition_projection_format        |               | varchar        | Date column projection format                                                                                                                                              >
 hive         | partition_projection_interval      |               | integer        | Integer column projection range interval                                                                                                                                   >
 hive         | partition_projection_interval_unit |               | varchar        | Date column projection range interval unit. Possible values: [NANOS, MICROS, MILLIS, SECONDS, MINUTES, HOURS, HALF_DAYS, DAYS, WEEKS, MONTHS, YEARS, DECADES, CENTURIES, MI>
 hive         | partition_projection_range         |               | array(varchar) | Column projection range, applicable for date and integer projection type                                                                                                   >
 hive         | partition_projection_type          |               | varchar        | Type of partition projection. Possible values: [ENUM, INTEGER, DATE, INJECTED]                                                                                             >
 hive         | partition_projection_values        |               | array(varchar) | Enum column projection values                                                                                                                                              >
(7 rows)
trino> SELECT * FROM system.metadata.materialized_view_properties;
 catalog_name |      property_name       | default_value |      type      |                           description
--------------+--------------------------+---------------+----------------+------------------------------------------------------------------
 iceberg      | format                   | ORC           | varchar        | File format for the table. Possible values: [ORC, PARQUET, AVRO]
 iceberg      | format_version           | 2             | integer        | Iceberg table format version
 iceberg      | location                 |               | varchar        | File system location URI for the table
 iceberg      | orc_bloom_filter_columns | []            | array(varchar) | ORC Bloom filter index columns
 iceberg      | orc_bloom_filter_fpp     | 0.05          | double         | ORC Bloom filter false positive probability
 iceberg      | partitioning             | []            | array(varchar) | Partition transforms
 iceberg      | storage_schema           |               | varchar        | Schema for creating materialized view storage table
(7 rows)
trino> SELECT * FROM system.metadata.schema_properties;
 catalog_name | property_name | default_value |  type   |                             description
--------------+---------------+---------------+---------+---------------------------------------------------------------------
 delta        | location      |               | varchar | URI for the default location to store tables created in this schema
 hive         | location      |               | varchar | Base file system location URI
 iceberg      | location      |               | varchar | Base file system location URI
(3 rows)
trino> SELECT * FROM system.metadata.table_properties;
 catalog_name |             property_name              | default_value |      type      |                                                          description
--------------+----------------------------------------+---------------+----------------+-------------------------------------------------------------------------------------------------------------------------------
 blackhole    | distributed_on                         | []            | array(varchar) | Distribution columns
 blackhole    | field_length                           | 16            | integer        | Overwrite default length (16) of variable length columns, such as VARCHAR or VARBINARY
 blackhole    | page_processing_delay                  | 0.00s         | varchar        | Sleep duration before processing each page
 blackhole    | pages_per_split                        | 0             | integer        | Number of pages per each split generated by this table
 blackhole    | rows_per_page                          | 0             | integer        | Number of rows per each page generated by this table
 blackhole    | split_count                            | 0             | integer        | Number of splits generated by this table
 delta        | checkpoint_interval                    |               | bigint         | Checkpoint interval
 delta        | location                               |               | varchar        | File system location URI for the table
 delta        | partitioned_by                         | []            | array(varchar) | Partition columns
 hive         | auto_purge                             | false         | boolean        | Skip trash when table or partition is deleted
 hive         | avro_schema_url                        |               | varchar        | URI pointing to Avro schema for the table
 hive         | bucket_count                           | 0             | integer        | Number of buckets
 hive         | bucketed_by                            | []            | array(varchar) | Bucketing columns
 hive         | bucketing_version                      |               | integer        | Bucketing version
 hive         | csv_escape                             |               | varchar        | CSV escape character
 hive         | csv_quote                              |               | varchar        | CSV quote character
 hive         | csv_separator                          |               | varchar        | CSV separator character
 hive         | external_location                      |               | varchar        | File system location URI for external table
 hive         | format                                 | ORC           | varchar        | Hive storage format for the table. Possible values: [ORC, PARQUET, AVRO, RCBINARY, RCTEXT, SEQUENCEFILE, JSON, TEXTFILE, CSV]
 hive         | null_format                            |               | varchar        | Serialization format for NULL value
 hive         | orc_bloom_filter_columns               | []            | array(varchar) | ORC Bloom filter index columns
 hive         | orc_bloom_filter_fpp                   | 0.05          | double         | ORC Bloom filter false positive probability
 hive         | partition_projection_enabled           |               | boolean        | Enable AWS Athena partition projection
 hive         | partition_projection_ignore            |               | boolean        | Disable AWS Athena partition projection in Trino only
 hive         | partition_projection_location_template |               | varchar        | Partition projection location template
 hive         | partitioned_by                         | []            | array(varchar) | Partition columns
 hive         | skip_footer_line_count                 |               | integer        | Number of footer lines
 hive         | skip_header_line_count                 |               | integer        | Number of header lines
 hive         | sorted_by                              | []            | array(varchar) | Bucket sorting columns
 hive         | textfile_field_separator               |               | varchar        | TEXTFILE field separator character
 hive         | textfile_field_separator_escape        |               | varchar        | TEXTFILE field separator escape character
 hive         | transactional                          |               | boolean        | Table is transactional
 iceberg      | format                                 | ORC           | varchar        | File format for the table. Possible values: [ORC, PARQUET, AVRO]
 iceberg      | format_version                         | 2             | integer        | Iceberg table format version
 iceberg      | location                               |               | varchar        | File system location URI for the table
 iceberg      | orc_bloom_filter_columns               | []            | array(varchar) | ORC Bloom filter index columns
 iceberg      | orc_bloom_filter_fpp                   | 0.05          | double         | ORC Bloom filter false positive probability
 iceberg      | partitioning                           | []            | array(varchar) | Partition transforms
 raptor       | bucket_count                           |               | integer        | Number of buckets into which to divide the table
 raptor       | bucketed_on                            | []            | array(varchar) | Table columns on which to bucket the table
 raptor       | distribution_name                      |               | varchar        | Shared distribution name for colocated tables
 raptor       | ordering                               | []            | array(varchar) | Sort order for each shard of the table
 raptor       | organized                              |               | boolean        | Keep the table organized using the sort order
 raptor       | temporal_column                        |               | varchar        | Temporal column of the table
 sqlserver    | data_compression                       |               | varchar        | DataCompression type for table. Possible values: [NONE, ROW, PAGE]
(45 rows)
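
The leak can be distilled to a single statement; a minimal sketch against the same cluster:

-- Under the rules above only the system catalog should be visible, yet this
-- returns every configured catalog name seen in the output above
-- (blackhole, delta, hive, iceberg, raptor, sqlserver).
SELECT DISTINCT catalog_name FROM system.metadata.table_properties;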

In comparison, system.metadata.catalogs works as expected:

trino> SELECT * FROM system.metadata.catalogs;
 catalog_name | connector_id | connector_name
--------------+--------------+----------------
 system       | system       | system
(1 row)
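
A query along these lines could serve as a check once the affected tables filter catalogs the same way system.metadata.catalogs does; a sketch that should return no rows after a fix:

-- Catalog names visible through table_properties but not through the
-- access-control-aware catalogs table; any output indicates the leak.
SELECT DISTINCT p.catalog_name
FROM system.metadata.table_properties p
LEFT JOIN system.metadata.catalogs c
  ON p.catalog_name = c.catalog_name
WHERE c.catalog_name IS NULL;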

huberty89 · Sep 06 '22

@huberty89 can you please add a checklist to the issue description, saying explicitly which tables are known to be affected?

findepi · Sep 09 '22

> @huberty89 can you please add a checklist to the issue description, saying explicitly which tables are known to be affected?

Added

huberty89 · Sep 19 '22