[hive] Implement a new presto procedure to add existing partition location to metastore

Open imjalpreet opened this issue 1 year ago • 0 comments

The Presto Hive Connector currently has two procedures, sync_partition_metadata and create_empty_partition, that let users sync existing partitions to the metastore or create new empty partitions. Both procedures work only if the partitions in the filesystem follow Hive's convention: every partition directory must sit inside the table location defined in the metastore and must be named <table_location>/partition_column1=value1/partition_column2=value2/ and so on. An existing partition that does not follow this naming convention, or that is located outside the table location, currently cannot be added to the metastore using Presto.
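To make the convention concrete, here is a minimal Java sketch of how the expected partition directory could be derived from the table location and how a custom location fails that check. The class and method names (HivePartitionPaths, conventionalPath, followsConvention) are hypothetical and not part of Presto. The sketch also ignores the escaping Hive applies to special characters in partition values.

```java
import java.util.List;

// Illustrative helper only; none of these names exist in Presto.
final class HivePartitionPaths {
    private HivePartitionPaths() {}

    // Builds <table_location>/col1=val1/col2=val2 (no escaping of special characters).
    static String conventionalPath(String tableLocation, List<String> columns, List<String> values) {
        if (columns.size() != values.size()) {
            throw new IllegalArgumentException("partition columns and values must have the same length");
        }
        StringBuilder path = new StringBuilder(stripTrailingSlashes(tableLocation));
        for (int i = 0; i < columns.size(); i++) {
            path.append('/').append(columns.get(i)).append('=').append(values.get(i));
        }
        return path.toString();
    }

    // True only when the partition directory is exactly where the convention expects it.
    static boolean followsConvention(String tableLocation, List<String> columns,
                                     List<String> values, String partitionLocation) {
        return conventionalPath(tableLocation, columns, values)
                .equals(stripTrailingSlashes(partitionLocation));
    }

    private static String stripTrailingSlashes(String s) {
        int end = s.length();
        while (end > 0 && s.charAt(end - 1) == '/') {
            end--;
        }
        return s.substring(0, end);
    }

    public static void main(String[] args) {
        List<String> columns = List.of("ds", "country");
        List<String> values = List.of("2024-02-22", "US");
        String table = "s3://bucket/warehouse/page_views";

        // A partition inside the table location, named by convention.
        System.out.println(followsConvention(table, columns, values,
                "s3://bucket/warehouse/page_views/ds=2024-02-22/country=US/"));
        // A partition stored elsewhere: the case the existing procedures cannot handle.
        System.out.println(followsConvention(table, columns, values,
                "s3://other-bucket/archive/2024/us/"));
    }
}
```

The second location in the example is the kind of partition this issue is about: it holds valid data, but nothing in its path ties it to the table or to the partition values.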

Expected Behavior or Use Case

Implement a new procedure that lets a user add an existing partition located at a custom location to the metastore.

Presto Component, Service, or Connector

Hive Connector

Possible Implementation

The arguments can be similar to those of the existing create_empty_partition procedure, with the addition of location:

system.add_partition(schema_name, table_name, partition_columns, partition_values, location)
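As a rough illustration of the checks such a procedure might run on its arguments before touching the metastore, here is a small Java sketch. It is not taken from CreateEmptyPartitionProcedure. The class name AddPartitionArguments, the validate method, and the specific rules (matching the table's partition columns in order, a non-blank location) are assumptions made for the example.

```java
import java.util.List;

// Hypothetical argument checks for the proposed add_partition procedure.
final class AddPartitionArguments {
    private AddPartitionArguments() {}

    // tablePartitionColumns stands in for what the metastore reports for the table.
    static void validate(List<String> tablePartitionColumns,
                         List<String> partitionColumns,
                         List<String> partitionValues,
                         String location) {
        if (partitionColumns.size() != partitionValues.size()) {
            throw new IllegalArgumentException(
                    "partition_columns and partition_values must have the same length");
        }
        // Assumption: the caller must list the table's partition columns in table order.
        if (!tablePartitionColumns.equals(partitionColumns)) {
            throw new IllegalArgumentException(
                    "partition_columns " + partitionColumns
                            + " do not match the table's partition columns " + tablePartitionColumns);
        }
        if (location == null || location.trim().isEmpty()) {
            throw new IllegalArgumentException("location must not be empty");
        }
    }

    public static void main(String[] args) {
        // Passes: columns match the table and a location is supplied.
        validate(List.of("ds", "country"),
                List.of("ds", "country"),
                List.of("2024-02-22", "US"),
                "s3://other-bucket/archive/2024/us/");
        System.out.println("arguments accepted");
    }
}
```

A real implementation would likely also confirm that the location exists and that the partition is not already registered. Those checks depend on the filesystem and the metastore, so they are left out here.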

You can start by looking at the implementation of the create_empty_partition procedure: https://github.com/prestodb/presto/blob/master/presto-hive/src/main/java/com/facebook/presto/hive/CreateEmptyPartitionProcedure.java

Context

As mentioned earlier, this would let users register existing partitions at custom locations in the metastore using Presto, rather than relying on external tools such as Hive.

imjalpreet, Feb 22 '24 20:02