Split function should work with Pairs and data classes

Open belovrv opened this issue 2 years ago • 4 comments

It would be great if the split function could automatically work with Pair values and other data classes.

belovrv avatar Mar 20 '23 07:03 belovrv

It can be done like this, but let's discuss it before adding it to the library.

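import kotlin.reflect.full.memberProperties
import org.jetbrains.kotlinx.dataframe.api.*

// A is not defined in the original comment; a two-property data class is assumed here.
data class A(val one: String, val two: String)

// Collects the values of all member properties of an instance via reflection.
inline fun <reified T : Any> getDataClassPropertyValues(obj: T): List<Any?> {
    return T::class.memberProperties.map { it.get(obj) }
}

val df = dataFrameOf("objects")(
    A("1", "2"),
    A("11", "22"),
)

df.print()

// The same splitter, with three different destinations for the results.
df.split { "objects"<A>() }.by { getDataClassPropertyValues(it) }.inplace().print()
df.split { "objects"<A>() }.by { getDataClassPropertyValues(it) }.into("split").print()
df.split { "objects"<A>() }.by { getDataClassPropertyValues(it) }.inward("one", "two").print()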
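One caveat with the reflection approach above: as far as I know, memberProperties does not promise declaration order, so the split values may not line up with the constructor parameters. Below is a minimal sketch that follows the primary constructor order instead. The names declaredOrderValues and Point are hypothetical, and it assumes kotlin-reflect is on the classpath.

```kotlin
import kotlin.reflect.full.memberProperties
import kotlin.reflect.full.primaryConstructor

// Hypothetical sample class; parameter names are deliberately not alphabetical.
data class Point(val y: String, val x: String)

// Returns property values in primary-constructor order.
inline fun <reified T : Any> declaredOrderValues(obj: T): List<Any?> {
    val byName = T::class.memberProperties.associateBy { it.name }
    val params = T::class.primaryConstructor?.parameters ?: return emptyList()
    // Constructor parameters without a matching member property are skipped.
    return params
        .mapNotNull { p -> p.name?.let { byName[it] } }
        .map { it.get(obj) }
}

fun main() {
    println(declaredOrderValues(Point("first", "second"))) // prints [first, second]
}
```

A splitter built this way could be passed to by { ... } in place of getDataClassPropertyValues, so that names given to into or inward match the constructor order.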


koperagen avatar Mar 22 '23 12:03 koperagen

Thinking about it, we could also research integration with kotlinx.serialization. Its serializers can provide the required information. I wonder whether that would be any better than using reflection in toDataFrame and in other reflective operations like this proposed split clause.

koperagen avatar Mar 22 '23 13:03 koperagen

Isn't that what unfold is for?

pacher avatar Mar 23 '23 13:03 pacher

split with inward does indeed work like unfold, but split has a lot of other destination options :) I'm not sure how @belovrv wants to use it: https://kotlin.github.io/dataframe/split.html

.into(columnNames) [ { columnNamesGenerator } ] | .inward(columnNames) [ { columnNamesGenerator } ] | .inplace() | .intoRows() | .intoColumns() // where to store results
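For the Pair case from the issue title, two of these destinations can be tried without any reflection, because Pair.toList() is part of the standard library. This is a rough sketch, assuming the Kotlin DataFrame library is on the classpath. The column name "pairs" and the functions flatSplit and nestedSplit are hypothetical names used only for illustration.

```kotlin
import org.jetbrains.kotlinx.dataframe.AnyFrame
import org.jetbrains.kotlinx.dataframe.api.*

// Hypothetical sample frame with a single column of Pair values.
fun pairsFrame(): AnyFrame = dataFrameOf("pairs")(
    "1" to "2",
    "11" to "22",
)

// into: the original column is replaced by two top-level columns.
fun flatSplit(): AnyFrame =
    pairsFrame().split { "pairs"<Pair<String, String>>() }
        .by { it.toList() }
        .into("first", "second")

// inward: the new columns are nested under the original column name.
fun nestedSplit(): AnyFrame =
    pairsFrame().split { "pairs"<Pair<String, String>>() }
        .by { it.toList() }
        .inward("first", "second")

fun main() {
    flatSplit().print()
    nestedSplit().print()
}
```

Whether split should do this automatically, without an explicit by { ... }, is the open question in this issue.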

koperagen avatar Mar 23 '23 13:03 koperagen