feat: support create partition table for non REST catalog
fixes: #578
Hi, @FANNG1 Thanks for your contribution. The reason why we use UnboundPartitionSpec rather PartitionSpec is to simplify the usage of this method. PartitionSpec is bound to a schema, and spec id, field ids are supposed to be meaningful. These ids are not supposed to be passed by user, but by the catalog implementation.
Hi, @FANNG1 Thanks for your contribution. The reason why we use
UnboundPartitionSpecratherPartitionSpecis to simplify the usage of this method.PartitionSpecis bound to a schema, and spec id, field ids are supposed to be meaningful. These ids are not supposed to be passed by user, but by the catalog implementation.
I agree that PartitionSpec is bound to a schema which is not user friendly, but when building UnboundPartitionSpec, we need source column id not name, it seems not friendly too? How about Using a separate TableRequestBuilder to build the TableCreation, in which PartitionSpec is generated by a more user friendly interfaces, WDYT?
pub struct TableCreation {
/// The name of the table.
pub name: String,
/// The location of the table.
#[builder(default, setter(strip_option))]
pub location: Option<String>,
/// The schema of the table.
pub schema: Schema,
/// The partition spec of the table, could be None.
#[builder(default, setter(strip_option, into))]
pub partition_spec: Option<PartitionSpec>,
/// The sort order of the table.
#[builder(default, setter(strip_option))]
pub sort_order: Option<SortOrder>,
/// The properties of the table.
#[builder(default)]
pub properties: HashMap<String, String>,
}
The reason we use PartitionSpec was that some fields, such as spec id, partition field id should not be passed by user when creating a table. But I think you are right when a user wants to create partition spec of a table, it would be more nature to build against a schema. This maybe confusing to user because their spec id and partition field id will be lost, it seems that there is no type safe approach to ensure, and we need to add some doc to explain it.
Thanks @FANNG1 for this pr. However I think there are some prepartion work before we can actually finished this pr. If we can narrow down the goal of this pr to change type for
UnboundPartitonSpectoPartitionSpeconly, we can merge this.
@liurenjie1024 , thanks for your review, let's keep consistent with what means change type for UnboundPartitonSpec to PartitionSpec only. The PR consistent serval parts:
- Change
UnboundPartitonSpectoPartitionSpecin TableCreation - Use
PartitionSpecto build the table metadata - Integration tests about creating and showing partition table
- Examples about create partition table and read data from table.
What you mean is just keeping the first part?
Thanks @FANNG1 for this pr. However I think there are some prepartion work before we can actually finished this pr. If we can narrow down the goal of this pr to change type for
UnboundPartitonSpectoPartitionSpeconly, we can merge this.@liurenjie1024 , thanks for your review, let's keep consistent with what means
change type for UnboundPartitonSpec to PartitionSpec only. The PR consistent serval parts:
- Change
UnboundPartitonSpectoPartitionSpecin TableCreation- Use
PartitionSpecto build the table metadata- Integration tests about creating and showing partition table
- Examples about create partition table and read data from table.
What you mean is just keeping the first part?
Yes, I think keeping only part 1 is a good start. Add PartitionSpec is sth how similar to a transaction with two actions:
- Add partition spec
- Set default partition spec.
I think it's better to implement them in transaction api.