Exception in documentation example of `toDataFrame`
In Operations/Create/DataFrame we have:
val df = students.toDataFrame {
    // add column
    "year of birth" from { 2021 - it.age }
    // scan all properties
    properties(maxDepth = 1) {
        exclude(Score::subject) // `subject` property will be skipped from object graph traversal
        preserve<Name>() // `Name` objects will be stored as-is without transformation into DataFrame
    }
    // add column group
    "summary" {
        "max score" from { it.scores.maxOf { it.value } }
        "min score" from { it.scores.minOf { it.value } }
    }
}
Executing this in a Kotlin Notebook cell results in: Exception while analyzing expression in (13,28) in Line_123.jupyter.kts
Line 13 refers to "max score" from { it.scores.maxOf { it.value } }
Gradle settings:
plugins {
    kotlin("jvm") version "2.2.20-Beta1"
    kotlin("plugin.dataframe") version "2.2.20-Beta1"
}
dependencies {
    implementation("org.jetbrains.kotlinx:dataframe:1.0.0-Beta2")
    testImplementation(kotlin("test"))
}
Hi!
I suspect this is due to this issue: https://github.com/Kotlin/dataframe/issues/1116. Notebooks have issues with statistics and explicitly-not-nullable types. There's a variant of 1.0.0-Beta2 that forces statistics to only be callable on non-null columns: 1.0.0-dev-7089. Could you check whether that works?
The relevant issue can be tracked here: https://youtrack.jetbrains.com/issue/KT-76441/IllegalStateException-null-DefinitelyNotNullType-for-T-exception-while-analyzing-expression
With implementation("org.jetbrains.kotlinx:dataframe:1.0.0-dev-7089") there is no error and the output of students.toDataFrame {..}.print() is:
year of birth name age scores summary
0 2006 Name(firstName=Alice, lastName=Cooper) 15 [2 x 1] { max score:4, min score:3 }
1 2001 Name(firstName=Bob, lastName=Marley) 20 [1 x 1] { value:5 } { max score:5, min score:5 }
With notebooks being cached and multiple ways to specify library dependencies, it is not always clear which version of DF is being executed. Is there a way to determine the version of DF, similar to LetsPlot.getInfo()?
actually yes, you can call dataFrameConfig.version :)
@Jolanrensen looks like we need to release Beta-3 and Beta-3-for-Notebooks
@Jolanrensen I've heard that you tested this and that it's fixed in the new IDEA version. Could you please confirm, and close the issue if that's true?
Yes and no.
It's still a case of https://github.com/Kotlin/dataframe/issues/1116, which will be fixed when K2 becomes the default backend for notebooks. This can be tested at the moment by enabling the registry flag kotlin.notebook.replCompilerMode.enabled and setting the kernel version to something like 0.17.0-754 in K2 mode. However, at the time of writing, even the nightly version of IntelliJ does not yet fully support this combination of settings.
So, if you plan to attach a notebook to your module, make sure your module uses the -n version of dataframe, like 1.0.0-Beta4n.
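For reference, a minimal sketch of what that could look like in the module's build.gradle.kts. It only swaps the dependency from the original report for the -n variant mentioned above; the version string is the one quoted in this thread and may be outdated, and the plugins block is assumed to stay as in the original report.

```kotlin
// build.gradle.kts (sketch; adjust versions to whatever is current)
dependencies {
    // "-n" variant of DataFrame, for modules that have a Kotlin Notebook attached.
    // Version taken from the comment above; treat it as an example, not a recommendation.
    implementation("org.jetbrains.kotlinx:dataframe:1.0.0-Beta4n")
    testImplementation(kotlin("test"))
}
```

After changing the dependency, restarting the notebook kernel and checking dataFrameConfig.version should help confirm that the cached notebook actually picked up the new version.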