columnOf does not consume Iterable<Any> correctly

Open holgerbrandl opened this issue 2 years ago • 1 comments

While we enjoy vararg constructors/builders, these rarely apply to data-frames because of the number of records. Currently, the following snippet works but fails to match intuition:

val data = listOf(1,2,3,5,6) // typically not created with listOf but rather the result of some computation

val df = dataFrameOf(columnOf(data).named("age"))

The column type is List, although intuition suggests Int. Clearly, spreading with *someIterable.toTypedArray() would fix that, but the * (spread) operator is not something most new adopters of Kotlin are aware of. However, since an iterable of objects seems the most common starting point for building a data-frame programmatically, it would be great if the API could support that. Currently, I think this collides with the other vararg signatures, but that could be overcome by providing a dedicated vararg constructor, as in

// vararg for those who need/want/insist on vararg
public inline fun <reified T> columnOf(first: T, vararg values: T)

// cover most real-world use cases
public inline fun <reified T> columnOf(first: Iterable<T>)

The possible conflict with the other generic implementations of columnOf could potentially be resolved internally via reflection. Also, having Iterable<AnyBaseCol> or Iterable<DataFrame> as the argument of columnOf might be needed internally, but it's unlikely to be the primary way users want to consume the API (which IMHO is clearly Iterable<SomethingElseSuchAsStringIntOrAny>).
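The inference behavior described above can be reproduced without the library. The following is a minimal sketch, assuming a hypothetical `SketchColumn` class and `sketchColumnOf` builder that only mimic the shape of a vararg builder; they are not part of the Kotlin DataFrame API.

```kotlin
// Hypothetical stand-in for a data-frame column; NOT the Kotlin DataFrame API.
class SketchColumn<T>(val values: List<T>)

// Hypothetical vararg builder, mirroring the shape of a columnOf(vararg values: T) signature.
fun <T> sketchColumnOf(vararg values: T): SketchColumn<T> = SketchColumn(values.toList())

fun main() {
    val data = listOf(1, 2, 3, 5, 6)

    // The whole list is passed as ONE vararg element, so T is inferred as List<Int>.
    val nested: SketchColumn<List<Int>> = sketchColumnOf(data)
    println(nested.values.size) // prints 1

    // Spreading a typed array passes each element separately, so T is inferred as Int.
    val flat: SketchColumn<Int> = sketchColumnOf(*data.toTypedArray())
    println(flat.values.size) // prints 5
}
```

The explicit type annotations on `nested` and `flat` only compile because the compiler infers exactly those element types, which is the mismatch with intuition the report describes.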

holgerbrandl, Sep 27 '22 08:09

Hmm, interesting thought. However, I'd say the current implementation is more in line with what is expected to happen. If we went with your approach, columnOf(1) would equal columnOf(listOf(1)), which doesn't really seem right to me. Say someone does want to create a column containing a single list: how would that work? Plus, if you want to create a column from your data the way you expect, we also have the Iterable<T>.toColumn() extension function:

val data = listOf(1,2,3,5,6)
val df = dataFrameOf(data.toColumn(name = "age"))

This indicates more of a conversion (thanks to "to"), while columnOf() suggests that the argument is simply put into a column, like how it's done at the moment. You could still argue about a columnFrom(someIterable) but that would need some more thinking.
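The "of" versus "to" distinction can be illustrated in isolation. This is a rough sketch under stated assumptions: `NamedSketchColumn`, `namedColumnOf`, and `toSketchColumn` are hypothetical names invented for the example and are not the library's implementation of columnOf() or toColumn().

```kotlin
// Hypothetical stand-in for a named column; NOT the Kotlin DataFrame API.
data class NamedSketchColumn<T>(val name: String, val values: List<T>)

// "of": the arguments are placed into the column as they are.
fun <T> namedColumnOf(name: String, vararg values: T): NamedSketchColumn<T> =
    NamedSketchColumn(name, values.toList())

// "to": the elements of the receiver become the values of the column.
fun <T> Iterable<T>.toSketchColumn(name: String): NamedSketchColumn<T> =
    NamedSketchColumn(name, this.toList())

fun main() {
    val data = listOf(1, 2, 3, 5, 6)

    // Conversion: one column value per list element.
    val ages = data.toSketchColumn(name = "age")
    println(ages.values) // prints [1, 2, 3, 5, 6]

    // Wrapping: a column with a single value that happens to be a list.
    val single = namedColumnOf("ages", data)
    println(single.values) // prints [[1, 2, 3, 5, 6]]
}
```

Keeping the two operations under different names means a column holding a single list stays expressible, which is the case the reply raises against an Iterable overload.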

Jolanrensen, Sep 30 '22 11:09

Gonna close this. Feel free to reopen if you have something to add :)

Jolanrensen, Dec 20 '22 13:12

Sorry for not coming back here after your kind and helpful comment.

I think I simply missed toColumn, and it works pretty well. Because I stumbled over columnOf first, I hope this was still somewhat useful feedback from a new user's perspective.

holgerbrandl, Dec 20 '22 16:12