Support additional Number types (like Percentage)
I was experimenting with adding a wrapper around Doubles for percentages, since that's something many other table solutions support.
Something like this works for basic usage:
import kotlin.math.pow
import kotlin.math.round

data class Percentage(val value: Double, val noDecimals: Int = 2) : Number(), Comparable<Percentage> {
    override fun toByte(): Byte = value.toInt().toByte()
    override fun toDouble(): Double = value
    override fun toFloat(): Float = value.toFloat()
    override fun toInt(): Int = value.toInt()
    override fun toLong(): Long = value.toLong()
    override fun toShort(): Short = value.toInt().toShort()
    override fun compareTo(other: Percentage): Int = value.compareTo(other.value)

    // Note: Number.toChar() is deprecated in recent Kotlin versions.
    override fun toChar(): Char = value.toInt().toChar()

    // Rounds to the given number of decimal places.
    private fun Double.roundToNoDecimals(noDecimals: Int): Double =
        round(this * 10.0.pow(noDecimals)) / 10.0.pow(noDecimals)

    // Renders the fraction as a percentage, e.g. a value of 0.5 as "50.0%".
    override fun toString(): String {
        val percentage = value * 100.0
        val double =
            if (noDecimals <= 0) {
                round(percentage).toLong()
            } else {
                percentage.roundToNoDecimals(noDecimals)
            }
        return "$double%"
    }
}

fun Number.toPercentage(noDecimals: Int = 2): Percentage = Percentage(this.toDouble(), noDecimals)
However, since this is a Number, I was under the impression we supported converting it (among other things) right out of the gate. Unfortunately I was wrong:
To fully support any new Number implementation, we need:
- Decide on a "default" number type to convert unknowns to. I suggest Double.
- filled in Number in impl/convert.kt::createConverter
- Check statistics for Number cases: https://github.com/Kotlin/dataframe/issues/558
- Column arithmetics for Number columns
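To make the first point concrete, here is a minimal sketch of what "convert unknowns to Double" could look like. It is not DataFrame's actual code: `Ratio` and `normalizeNumber` are hypothetical names used only for illustration, and the sketch assumes a Kotlin version where `Number.toChar()` no longer has to be overridden.

```kotlin
// Hypothetical custom Number, standing in for a type like Percentage.
class Ratio(private val value: Double) : Number() {
    override fun toByte(): Byte = value.toInt().toByte()
    override fun toDouble(): Double = value
    override fun toFloat(): Float = value.toFloat()
    override fun toInt(): Int = value.toInt()
    override fun toLong(): Long = value.toLong()
    override fun toShort(): Short = value.toInt().toShort()
}

// Hypothetical helper: known primitive number types pass through unchanged,
// any other Number implementation falls back to Double.
fun normalizeNumber(n: Number): Number = when (n) {
    is Byte, is Short, is Int, is Long, is Float, is Double -> n
    else -> n.toDouble() // assumption: Double is the agreed "default" type
}

fun main() {
    println(normalizeNumber(42))          // stays an Int
    println(normalizeNumber(Ratio(0.25))) // unknown type, converted to Double
}
```

The obvious trade-off is that the fallback drops whatever the wrapper carried besides its numeric value (for Percentage, the formatting), and types like BigDecimal could lose precision if they went down the same path.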
Hi @zaleslaw, @Jolanrensen I'd love to work on this issue! I've just completed a Kotlin module at university and am keen to contribute to real-world projects.
I understand this involves extending support across converters, statistics, and arithmetic operations. This sounds like the perfect first open-source Kotlin issue!
Would you be able to assign this to me? I'm happy to start with a basic implementation and get feedback before tackling all the integration points.
Thanks! Emmanuel
Hi @AnosVoldygod! Thanks for your enthusiasm :) However, a lot has happened since I created this issue more than a year ago. We did a complete overhaul of statistics, limiting them to just primitive types (with Double as the most complex type), and unified how numbers are handled inside statistics and when reading from JSON.
Not only that, but we're in the process of releasing DataFrame 1.0 now. This means we're looking more into stabilizing, and removing stuff (like column arithmetics), rather than adding it. We're currently at beta2, which means we're on a feature freeze until the eventual release. We can fix bugs and other essential stuff, but the API should stay the same.
Of course, I won't stop you from making a draft implementation :) You're more than welcome. However, if you want to have a more direct impact, I'd advise looking at other "good first issues" like https://github.com/Kotlin/dataframe/issues/1098, https://github.com/Kotlin/dataframe/issues/890, https://github.com/Kotlin/dataframe/issues/1129, or https://github.com/Kotlin/dataframe/issues/1006 (I just went over our issues to mark a few as such. Actually, I would not consider this particular issue a "good first issue" at all). These might be small enough to be merged before the 1.0 release. That said, if you want to use DataFrame in your own project and you have a need for percentage numbers, of course you can do that too :) Just don't expect it to be merged until 1.1.
Let us know if we can assist you in anything and what to assign to you :) We also have a contribution guide if you're curious.
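For anyone who needs percentages in their own project in the meantime, one possible workaround (a sketch, not something the library provides) is to keep the values as plain Doubles so they stay within the primitive types statistics now work on, and only format them as percentages for display. `formatPercent` is a hypothetical helper name.

```kotlin
import java.util.Locale

// Hypothetical helper: formats a fraction such as 0.125 as "12.5%".
// Locale.ROOT keeps the decimal separator a dot regardless of system locale.
fun formatPercent(fraction: Double, decimals: Int = 2): String =
    String.format(Locale.ROOT, "%.${decimals.coerceAtLeast(0)}f%%", fraction * 100.0)

fun main() {
    // Values stay plain Doubles, so ordinary numeric operations keep working.
    val fractions = listOf(0.125, 0.5, 0.25)
    val mean = fractions.average()

    // Formatting happens only at the display boundary.
    println(formatPercent(mean))
    println(fractions.map { formatPercent(it, 1) })
}
```

This sidesteps the custom Number type entirely, at the cost of having to remember which columns hold fractions.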
Hi @Jolanrensen, thanks for the quick and detailed response! I'll admit, I had to spend more time than I'd like Googling and researching what you said, but I see where you are coming from! It makes sense that the architecture you've implemented (primitive data types) is not compatible with how this feature would work (wrapper object), and allowing this would likely open you up to more unpredictable errors. From my experience with engineering, I think this is a great approach: 1.0 should be the baseline and most robust version for you to build on, so sacrificing a feature for stability is a great move!
Thanks for the contribution recommendations! I'd love to work on them! I'll comment on them, but I'm definitely keen to start with #1006: Parse String to UUID, and then work my way up to the more complex ones like #1129 and #890!