Is there a solution for lead/lag/shift of values on a column?

Open tklinchik opened this issue 2 years ago • 2 comments

I'd like to lead/lag/shift values within a groupBy. I can't find any examples or API for that, so I wanted to raise a question. I would expect the syntax to be similar to the following (assuming a data set of incidents):

df.groupBy("Date").aggregate {
    lag("IncidentTime", 1) into "PrevIncidentTime"
}

tklinchik avatar Nov 28 '23 21:11 tklinchik

Hm, for now I can recommend this approach:

df.groupBy("Date").aggregate {
    "IncidentTime" into "IncidentTime"
}.add("PrevIncidentTime") { prev()?.getValue<LocalDateTime>("IncidentTime") }

What about the offset: do you need something other than 1 previous / next? It would be interesting to know, or to have references, so that we can add to or extend the API.

koperagen avatar Nov 29 '23 10:11 koperagen

Adding this capability would be great. I've used Pandas heavily in the past, and as I transition to the Kotlin DataFrame API a few things are missing; this is one of the more involved ones. In Pandas there is shift(periods), where periods can be positive or negative, and other APIs I've used have lead or lag options.

Here is a Pandas example with a group by and a shift, but I assume in the Kotlin DataFrame API it might be more natural to express the shift within the aggregate lambda:

df['prev_value'] = df.groupby('object')['value'].shift(5)
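
For reference, a small runnable sketch of how the sign of periods maps to lag and lead within each group. The sample data is made up for illustration; only the column names come from the one-liner above.

```python
import pandas as pd

# Hypothetical sample data; column names follow the one-liner above.
df = pd.DataFrame({
    "object": ["a", "a", "a", "b", "b"],
    "value": [1, 2, 3, 10, 20],
})

grouped = df.groupby("object")["value"]

# Positive periods act like "lag": each row gets the value
# from one row earlier in its own group.
df["prev_value"] = grouped.shift(1)

# Negative periods act like "lead": each row gets the value
# from one row later in its own group.
df["next_value"] = grouped.shift(-1)

print(df)
```

Rows at the edge of a group have no neighbour to take a value from, so they come back as NaN. Values never cross from one group into another.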

taras-hillsidetec avatar Nov 29 '23 13:11 taras-hillsidetec