Generate constructor for schema interface
Support a GenerateConstructor annotation on the companion object of a schema interface. It would generate a default implementation of that interface and an append overload for DataFrame:
@DataSchema
interface Record {
    val a: Int
    val b: Int

    @GenerateConstructor
    companion object
}

// region Generated code
operator fun Record.Companion.invoke(a: Int, b: Int): Record =
    object : Record {
        override val a = a
        override val b = b
    }

fun DataFrame<Record>.append(vararg rows: Record) = concat(rows.asIterable().toDataFrame())
// endregion

// usage:
listOf(Record(1, 2), Record(3, 4))
    .toDataFrame()
    .append(Record(5, 6))
    .add("sum") { a + b }
A better design:
DataFrame API:
interface DataRowSchema

inline fun <reified T : DataRowSchema> dataFrameOf(vararg rows: T): DataFrame<T> =
    rows.asIterable().toDataFrame()

inline fun <reified T : DataRowSchema> DataFrame<T>.append(vararg rows: T): DataFrame<T> =
    concat(dataFrameOf(*rows))

User code:
@DataSchema
interface Record : DataRowSchema {
    val a: Int
    val b: Int

    companion object
}

// region Generated code
operator fun Record.Companion.invoke(a: Int, b: Int): Record =
    object : Record {
        override val a = a
        override val b = b
    }
// endregion

// usage:
dataFrameOf(Record(1, 2), Record(3, 4))
    .append(Record(5, 6))
    .add("sum") { a + b }
Can we somehow get rid of DataRowSchema?
In FIR, it's possible to add supertypes to an annotated class, so DataRowSchema can at least disappear from user code.
Here is another example to be aware of:
// User code:
@DataSchema
interface AnotherRecord : DataRowSchema {
    val a: Int
    val b: Int

    companion object
}

@DataSchema
interface Record : DataRowSchema {
    val a: List<AnotherRecord> // same as DataFrame<AnotherRecord>
    val b: AnotherRecord // same as DataRow<AnotherRecord>

    companion object
}

// usage:
dataFrameOf(Record(listOf(AnotherRecord(5, 6)), AnotherRecord(5, 6)))
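For the nested case, the generated constructors would presumably follow the same companion `invoke` pattern as above, with the nested schema types used directly as parameter types. The following is a hand-written sketch of that idea, not actual plugin output. It has no dependency on the DataFrame library: `DataRowSchema` here is a local stand-in for the proposed marker interface, the `@DataSchema` annotation is omitted, and the `invoke` signatures are an assumption about what a generator might emit.

```kotlin
// Local stand-in for the proposed marker interface (assumption: the real one
// would be provided by the DataFrame library).
interface DataRowSchema

interface AnotherRecord : DataRowSchema {
    val a: Int
    val b: Int

    companion object
}

interface Record : DataRowSchema {
    val a: List<AnotherRecord> // would correspond to DataFrame<AnotherRecord>
    val b: AnotherRecord       // would correspond to DataRow<AnotherRecord>

    companion object
}

// Hypothetical generated code: one invoke operator per schema interface.
operator fun AnotherRecord.Companion.invoke(a: Int, b: Int): AnotherRecord =
    object : AnotherRecord {
        // The initializers refer to the function parameters, not the properties.
        override val a: Int = a
        override val b: Int = b
    }

operator fun Record.Companion.invoke(a: List<AnotherRecord>, b: AnotherRecord): Record =
    object : Record {
        override val a: List<AnotherRecord> = a
        override val b: AnotherRecord = b
    }

fun main() {
    val record = Record(listOf(AnotherRecord(5, 6)), AnotherRecord(7, 8))
    println(record.a.single().b) // prints 6
    println(record.b.a)          // prints 7
}
```

The sketch only covers construction of the row objects. How `dataFrameOf` would convert the `List<AnotherRecord>` and `AnotherRecord` properties into a frame column and a column group is left open here.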