dataframe
columnOf does not consume Iterable<Any> correctly
While we enjoy vararg constructors/builders, these rarely apply to data-frames because of the number of records. Currently, the following snippet works but fails to match intuition:
val data = listOf(1,2,3,5,6) // typically not created with listOf but rather the result of some computation
val df = dataFrameOf(columnOf(data).named("age"))
The column type is List, although intuition suggests Int. Clearly, spreading the collection with *someIterable.toTypedArray() would fix that, but the * (spread) operator is not something most new adopters of Kotlin are aware of. However, as an iterable of objects seems the most common starting point for building a data-frame programmatically, it would be great if the API could support that. Currently, I think this collides with the other vararg signatures, but it could be overcome by providing a dedicated vararg constructor, as in
// vararg for those who need/want/insist on vararg
public inline fun <reified T> columnOf(first: T, vararg values: T)
// cover most real-world use cases
public inline fun <reified T> columnOf(values: Iterable<T>)
The possible conflict with the other generic implementations of columnOf could potentially be resolved internally via reflection. Also, having Iterable<AnyBaseCol> or Iterable<DataFrame> as the argument of columnOf might be needed internally, but it's unlikely to be the primary way users want to consume the API (which IMHO is clearly Iterable<SomethingElseSuchAsStringIntOrAny>).
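To make the vararg behaviour described above concrete, here is a minimal sketch in plain Kotlin with no DataFrame dependency. The helper countArgs is hypothetical and only stands in for any vararg function; it is not part of the library.

```kotlin
// Hypothetical helper (not part of any library): reports how many
// arguments a generic vararg function actually receives.
fun <T> countArgs(vararg values: T): Int = values.size

fun main() {
    val data = listOf(1, 2, 3, 5, 6)

    // The whole list is passed as ONE argument, so T is inferred as List<Int>.
    println(countArgs(data))                  // 1

    // The spread operator unpacks the array into individual arguments, so T is Int.
    println(countArgs(*data.toTypedArray()))  // 5
}
```

The same inference is presumably what makes columnOf(data) produce a single-row column of lists rather than a column of Int values.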
Hmm, interesting thought. However, I'd say the current implementation is more in line with what is expected to happen. If we went with your approach, columnOf(1) == columnOf(listOf(1)), which doesn't really seem right to me. Say someone does want to create a column containing a single list; how would that work then?
Plus, if you want to create a column from your data in the way you expect, we also have the Iterable<T>.toColumn() extension function:
val data = listOf(1,2,3,5,6)
val df = dataFrameOf(data.toColumn(name = "age"))
This indicates more of a conversion (thanks to the "to"), while columnOf() suggests that the argument is simply put into a column, which is how it works at the moment. You could still argue for a columnFrom(someIterable), but that would need some more thinking.
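The "Of" versus "to" distinction follows the same naming convention as the Kotlin standard library, which may help as an analogy. The sketch below uses only stdlib calls (listOf and toList) and says nothing about how the DataFrame functions are implemented.

```kotlin
fun main() {
    val data = listOf(1, 2, 3, 5, 6)

    // "Of" wraps its arguments: the result holds one element, the list itself.
    val wrapped: List<List<Int>> = listOf(data)
    println(wrapped.size)    // 1

    // "to" converts the receiver: the result holds the five Int elements.
    val converted: List<Int> = data.toList()
    println(converted.size)  // 5
}
```

Read this way, columnOf(data) corresponds to listOf(data), and data.toColumn() corresponds to data.toList().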
Gonna close this. Feel free to reopen if you have something to add :)
Sorry for not coming back here after your kind and helpful comment.
I think I simply missed toColumn, and it works pretty well. Because I stumbled over columnOf first, I hope this was still useful feedback from a new user's perspective.