paimon
[WIP] Introduce RowTypeProjection for ReadBuilder
Purpose
public API
public interface ReadBuilder extends Serializable {
ReadBuilder withRowTypeProjection(RowTypeProjection rowTypeProjection);
}
public class RowTypeProjection {
public static RowTypeProjection from(RowType rowType);
}
internal API
public class RowTypeProjection {
public int[] toTopLevelProjection(RowType rowType);
// skipProjectTopLevel is introduced for compatibility with the existing withProjection
public RowType project(RowType rowType, boolean skipProjectTopLevel);
}
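As a rough illustration of what toTopLevelProjection might compute, the sketch below maps each required top-level field id to its position in the table schema. This is only one plausible reading of the method name: the class name, the use of plain field-id lists instead of RowType, and the error handling are all assumptions made for this example, not code from the PR.

```java
import java.util.Arrays;
import java.util.List;

public class TopLevelProjectionSketch {

    // Hypothetical helper: for each required top-level field id, find its
    // position in the table schema. Field ids stand in for RowType fields.
    static int[] toTopLevelProjection(List<Integer> tableFieldIds, List<Integer> requiredFieldIds) {
        int[] projection = new int[requiredFieldIds.size()];
        for (int i = 0; i < projection.length; i++) {
            int position = tableFieldIds.indexOf(requiredFieldIds.get(i));
            if (position < 0) {
                throw new IllegalArgumentException("Unknown field id: " + requiredFieldIds.get(i));
            }
            projection[i] = position;
        }
        return projection;
    }

    public static void main(String[] args) {
        // Table fields pt(0), a(1), f0(2), f1(3); the read type only asks for f1(3).
        int[] projection = toTopLevelProjection(Arrays.asList(0, 1, 2, 3), Arrays.asList(3));
        System.out.println(Arrays.toString(projection));
    }
}
```

Matching by field id rather than by name keeps the lookup stable when a column is renamed, which is presumably why the example schema in this PR carries explicit ids.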
how to use
RowType writeType =
DataTypes.ROW(
DataTypes.FIELD(0, "pt", DataTypes.INT()),
DataTypes.FIELD(1, "a", DataTypes.INT()),
DataTypes.FIELD(2, "f0", DataTypes.INT()),
DataTypes.FIELD(
3,
"f1",
DataTypes.ROW(
DataTypes.FIELD(4, "f0", DataTypes.INT()),
DataTypes.FIELD(5, "f1", DataTypes.INT()),
DataTypes.FIELD(6, "f2", DataTypes.INT()))));
// write
// GenericRow.of(0, 0, 0, GenericRow.of(10, 11, 12))
RowType readType =
DataTypes.ROW(
DataTypes.FIELD(
3,
"f1",
DataTypes.ROW(
DataTypes.FIELD(4, "f0", DataTypes.INT()),
DataTypes.FIELD(6, "f2", DataTypes.INT()))));
RowTypeProjection rowTypeProjection = RowTypeProjection.from(readType);
ReadBuilder readBuilder = table.newReadBuilder().withRowTypeProjection(rowTypeProjection);
// read
// GenericRow.of(GenericRow.of(10, 12))
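The write/read example above can be mimicked without Paimon to show why the read yields only the nested values 10 and 12. The following is a minimal sketch, assuming fields are matched by id and rows are modelled as nested Object[] arrays; the Field class and the prune method are hypothetical stand-ins, not Paimon's DataField, RowType, or reader logic.

```java
import java.util.Arrays;
import java.util.List;

public class NestedPruneSketch {

    // Hypothetical stand-in for a schema field: id, name, and nested children
    // (an empty list means the field is a leaf such as INT).
    static final class Field {
        final int id;
        final String name;
        final List<Field> children;

        Field(int id, String name, Field... children) {
            this.id = id;
            this.name = name;
            this.children = Arrays.asList(children);
        }
    }

    // Keeps only the required fields of a row, matching by field id and
    // recursing into nested rows when the required field lists children.
    static Object[] prune(List<Field> written, List<Field> required, Object[] row) {
        Object[] out = new Object[required.size()];
        for (int i = 0; i < required.size(); i++) {
            Field want = required.get(i);
            int pos = indexOfId(written, want.id);
            if (pos < 0) {
                throw new IllegalArgumentException("Unknown field id: " + want.id);
            }
            out[i] = want.children.isEmpty()
                    ? row[pos]
                    : prune(written.get(pos).children, want.children, (Object[]) row[pos]);
        }
        return out;
    }

    static int indexOfId(List<Field> fields, int id) {
        for (int i = 0; i < fields.size(); i++) {
            if (fields.get(i).id == id) {
                return i;
            }
        }
        return -1;
    }

    // Mirrors the PR example: write (0, 0, 0, (10, 11, 12)), read f1.f0 and f1.f2.
    static String demo() {
        List<Field> writeType = Arrays.asList(
                new Field(0, "pt"), new Field(1, "a"), new Field(2, "f0"),
                new Field(3, "f1", new Field(4, "f0"), new Field(5, "f1"), new Field(6, "f2")));
        List<Field> readType = Arrays.asList(
                new Field(3, "f1", new Field(4, "f0"), new Field(6, "f2")));
        Object[] row = {0, 0, 0, new Object[] {10, 11, 12}};
        return Arrays.deepToString(prune(writeType, readType, row));
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [[10, 12]]
    }
}
```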
Tests
API and Format
Documentation
We just need a single method, pruneColumns(RowType requiredSchema).
RowType already contains all the required information (field name, field id, nested structure, ...), so it can replace the projection.
The final API would then look like this:
@Deprecated
default ReadBuilder withProjection(int[] projection) {
// Derive requiredSchema from projection (conversion omitted here)
return pruneColumns(requiredSchema);
}
ReadBuilder pruneColumns(RowType requiredSchema);
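To make the "projection -> requiredSchema" step concrete, here is a minimal sketch of how a deprecated index-based method could delegate to a schema-based one. It assumes the builder knows its full table schema and models a schema as a plain list of field names; ReadBuilderSketch, RecordingBuilder, tableFields, and toRequiredSchema are hypothetical names for this example and do not come from Paimon.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ProjectionBridgeSketch {

    // Hypothetical conversion: pick the table fields at the projected indices.
    static List<String> toRequiredSchema(List<String> tableFields, int[] projection) {
        List<String> required = new ArrayList<>(projection.length);
        for (int index : projection) {
            required.add(tableFields.get(index));
        }
        return required;
    }

    // Simplified stand-in for the proposed API shape (not Paimon's ReadBuilder).
    interface ReadBuilderSketch {
        List<String> tableFields();

        ReadBuilderSketch pruneColumns(List<String> requiredSchema);

        @Deprecated
        default ReadBuilderSketch withProjection(int[] projection) {
            // projection -> requiredSchema, then delegate to the new method
            return pruneColumns(toRequiredSchema(tableFields(), projection));
        }
    }

    // Toy implementation that only records what it was asked to read.
    static final class RecordingBuilder implements ReadBuilderSketch {
        private final List<String> fields;
        List<String> required;

        RecordingBuilder(List<String> fields) {
            this.fields = fields;
        }

        @Override
        public List<String> tableFields() {
            return fields;
        }

        @Override
        public ReadBuilderSketch pruneColumns(List<String> requiredSchema) {
            this.required = requiredSchema;
            return this;
        }
    }

    static List<String> demo() {
        RecordingBuilder builder = new RecordingBuilder(Arrays.asList("pt", "a", "f0", "f1"));
        builder.withProjection(new int[] {3, 0});
        return builder.required;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [f1, pt]
    }
}
```

With this shape, existing callers of withProjection keep working while implementations only need to support the schema-based method.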
@JingsongLi Thanks for the review, updated.