[hive] Implement a new Presto procedure to add an existing partition location to the metastore
The Presto Hive Connector currently has two procedures, system.sync_partition_metadata and system.create_empty_partition, that allow users to sync existing partitions to the metastore or create new empty partitions. The limitation of these procedures is that they can only be used if the partitions in the filesystem follow Hive's convention: all partition directories must be inside the table location defined in the metastore, and they must be named as <table_location>/partition_column1=value1/partition_column2=value2/ and so on. Currently, it is not possible to use Presto to add an existing partition that does not follow this convention or is located outside the table location.
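To make the convention concrete, here is a minimal Java sketch that builds the location Hive would expect for a partition and checks whether an actual directory matches it. The class and method names (PartitionPathSketch, conventionalLocation, followsConvention) and the example paths are made up for illustration and are not part of Presto. It also skips the escaping that Hive applies to special characters in partition values.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only; not a Presto or Hive class.
public class PartitionPathSketch {
    // Builds <table_location>/col1=val1/col2=val2/ from parallel lists.
    // Simplification: real Hive also escapes special characters in values.
    public static String conventionalLocation(String tableLocation, List<String> columns, List<String> values) {
        if (columns.size() != values.size()) {
            throw new IllegalArgumentException("columns and values must have the same length");
        }
        StringBuilder path = new StringBuilder(tableLocation);
        if (!tableLocation.endsWith("/")) {
            path.append('/');
        }
        for (int i = 0; i < columns.size(); i++) {
            path.append(columns.get(i)).append('=').append(values.get(i)).append('/');
        }
        return path.toString();
    }

    // True only if the actual directory is exactly where the convention puts it.
    public static boolean followsConvention(String tableLocation, List<String> columns, List<String> values, String actualLocation) {
        String expected = conventionalLocation(tableLocation, columns, values);
        String actual = actualLocation.endsWith("/") ? actualLocation : actualLocation + "/";
        return expected.equals(actual);
    }

    public static void main(String[] args) {
        String table = "s3://bucket/warehouse/events";
        List<String> columns = Arrays.asList("ds", "country");
        List<String> values = Arrays.asList("2024-01-01", "US");

        System.out.println(conventionalLocation(table, columns, values));
        // A directory outside the table location does not follow the convention.
        System.out.println(followsConvention(table, columns, values, "s3://other-bucket/archive/2024/01/01/US"));
    }
}
```

The second location in main is the kind of partition this issue is about: the data exists, but neither existing procedure can register it.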
Expected Behavior or Use Case
Implement a new procedure that lets a user add an existing partition located at a custom location to the metastore.
Presto Component, Service, or Connector
Hive Connector
Possible Implementation
The arguments can be similar to those of the existing create_empty_partition procedure, with the addition of location:
system.add_partition(schema_name, table_name, partition_columns, partition_values, location)
You can start by looking at the implementation of the create_empty_partition procedure: https://github.com/prestodb/presto/blob/master/presto-hive/src/main/java/com/facebook/presto/hive/CreateEmptyPartitionProcedure.java
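As a starting point for discussion, here is a rough, self-contained sketch of the argument checks such a procedure might perform before touching the metastore. Everything in it is an assumption: AddPartitionSketch, PartitionRegistration and validate are hypothetical names, the checks are only a guess at what would be needed, and no Presto or metastore API is called. The actual procedure would presumably follow the structure of CreateEmptyPartitionProcedure instead.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

// Hypothetical sketch; none of these types exist in Presto.
public class AddPartitionSketch {
    // Plain value object describing what would be registered in the metastore.
    public static final class PartitionRegistration {
        public final String schemaName;
        public final String tableName;
        public final String partitionName;
        public final String location;

        PartitionRegistration(String schemaName, String tableName, String partitionName, String location) {
            this.schemaName = schemaName;
            this.tableName = tableName;
            this.partitionName = partitionName;
            this.location = location;
        }
    }

    // Mirrors the proposed arguments:
    // (schema_name, table_name, partition_columns, partition_values, location)
    public static PartitionRegistration validate(String schemaName, String tableName,
            List<String> partitionColumns, List<String> partitionValues, String location) {
        Objects.requireNonNull(schemaName, "schemaName is null");
        Objects.requireNonNull(tableName, "tableName is null");
        Objects.requireNonNull(partitionColumns, "partitionColumns is null");
        Objects.requireNonNull(partitionValues, "partitionValues is null");
        Objects.requireNonNull(location, "location is null");

        if (partitionColumns.isEmpty()) {
            throw new IllegalArgumentException("at least one partition column is required");
        }
        if (partitionColumns.size() != partitionValues.size()) {
            throw new IllegalArgumentException("partition_columns and partition_values must have the same length");
        }
        if (location.trim().isEmpty()) {
            throw new IllegalArgumentException("location must not be empty");
        }

        // Partition name in col1=val1/col2=val2 form; value escaping is omitted here.
        List<String> parts = new ArrayList<>();
        for (int i = 0; i < partitionColumns.size(); i++) {
            parts.add(partitionColumns.get(i) + "=" + partitionValues.get(i));
        }
        return new PartitionRegistration(schemaName, tableName, String.join("/", parts), location);
    }

    public static void main(String[] args) {
        PartitionRegistration registration = validate("web", "events",
                Arrays.asList("ds", "country"), Arrays.asList("2024-01-01", "US"),
                "s3://other-bucket/archive/2024/01/01/US");
        System.out.println(registration.partitionName + " -> " + registration.location);
    }
}
```

The point of the sketch is that the location is carried as given rather than derived from the table location, which is what distinguishes this procedure from create_empty_partition.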
Context
As mentioned earlier, this would allow users to register existing partitions at custom locations in the metastore using Presto, rather than relying on external tools like Hive.