Python interface for daft.PartitionField and PartitionTransform
Is your feature request related to a problem?
Issue
Daft's PartitionField and PartitionTransform (code) don't expose a Python interface for accessing their attributes. For example, given a PartitionTransform object, there is no way to tell which type of transform it is. Similarly, a PartitionField object only exposes a PyField object, which isn't enough to represent a partition field.
This issue blocks model conversion between Daft and DeltaCAT.
However, this is not a high-priority issue: it only blocks Daft -> DeltaCAT model conversion, and the Daft-DeltaCAT integration only needs DeltaCAT -> Daft model conversion for Daft to be able to read a DeltaCAT table. Related DeltaCAT PR
Describe the solution you'd like
I'd like a more complete Python interface for accessing attributes of PartitionTransform and PartitionField objects.
Describe alternatives you've considered
No response
Additional Context
No response
Would you like to implement a fix?
No
hi @MingshiPeng, just for some clarity: is the ask that, for PartitionField, you want to access properties like those available on Field (name, dtype, ...)?
something like:
pf: PartitionField
pf.name # "my_field"
pf.dtype # DataType.int64()
and similarly, for PartitionTransform do you need to figure out what variant it is? We recently added similar functionality to DataType. Would this suffice?
example:
pt: PartitionTransform
pt.is_identity()
pt.is_iceberg_bucket()
pt.is_iceberg_truncate()
pt.is_year()
pt.is_month()
pt.is_day()
pt.is_hour()
pt.is_void()
if pt.is_iceberg_bucket():
    n_buckets = pt.num_buckets
if pt.is_iceberg_truncate():
    width = pt.width
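For the Daft -> DeltaCAT direction, predicates like these would be enough to dispatch on the variant. A rough, self-contained sketch follows. TransformSketch is a hypothetical stand-in rather than Daft's real PartitionTransform, and the output dict shape is invented for illustration, not DeltaCAT's actual model:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TransformSketch:
    """Hypothetical stand-in for PartitionTransform; NOT Daft's real class."""

    kind: str
    num_buckets: Optional[int] = None
    width: Optional[int] = None

    def is_identity(self) -> bool:
        return self.kind == "identity"

    def is_iceberg_bucket(self) -> bool:
        return self.kind == "iceberg_bucket"

    def is_iceberg_truncate(self) -> bool:
        return self.kind == "iceberg_truncate"


def describe_transform(pt) -> dict:
    """Map a transform to a plain dict using only the is_*() predicates.

    The dict layout is an assumption made for this example.
    """
    if pt.is_identity():
        return {"type": "identity"}
    if pt.is_iceberg_bucket():
        return {"type": "bucket", "num_buckets": pt.num_buckets}
    if pt.is_iceberg_truncate():
        return {"type": "truncate", "width": pt.width}
    # Variants not handled in this sketch (year, month, day, hour, void, ...)
    raise ValueError(f"unsupported transform: {pt!r}")


bucket = describe_transform(TransformSketch("iceberg_bucket", num_buckets=16))
truncate = describe_transform(TransformSketch("iceberg_truncate", width=4))
```

The converter only touches num_buckets or width after the matching predicate returns True, which mirrors the guarded access in the example above.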
https://github.com/ray-project/deltacat/pull/527#discussion_r2053071969
Hi @universalmind303
For PartitionField, the name and dtype attributes of Field are already accessible here, so I don't have a request there. My request is for PartitionField to expose an accessible interface for source_field and transform (see the PartitionField.__init__() pasted below, where those two attributes are passed in).
https://github.com/Eventual-Inc/Daft/blob/f1f425220b389e7116ddbf70b10268d85fc32ad4/daft/daft/init.pyi#L746-L751
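To make the request concrete, here is a minimal sketch of the kind of read-only accessors I have in mind. PartitionFieldSketch is a hypothetical pure-Python illustration, not Daft's actual class; the constructor argument names follow the __init__ referenced above, and the None defaults and string placeholder values are assumptions:

```python
class PartitionFieldSketch:
    """Hypothetical illustration; NOT Daft's actual PartitionField."""

    def __init__(self, field, source_field=None, transform=None):
        # Store the constructor arguments privately; defaults are assumed.
        self._field = field
        self._source_field = source_field
        self._transform = transform

    @property
    def field(self):
        """The partition field itself (a PyField in Daft)."""
        return self._field

    @property
    def source_field(self):
        """The field the partition value is derived from, if any."""
        return self._source_field

    @property
    def transform(self):
        """The transform applied to source_field, if any."""
        return self._transform


# Plain strings stand in for PyField / PartitionTransform objects here.
pf = PartitionFieldSketch(field="event_day", source_field="event_ts", transform="day")
print(pf.source_field, pf.transform)
```

With accessors like these, a converter could read source_field and transform directly instead of only seeing the wrapped field.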
For PartitionTransform, the example you shared is exactly the feature I need. I wasn't aware that it existed, so no further requests on the PartitionTransform side, thanks.