funs
funs copied to clipboard
na.rm argument in lead and lag?
It sometimes happens that one wants the next/previous non-missing value. It would be nice if this was supported in lead/lag functions, e.g., through na.rm argument.
This is also specified in SQL standard with IGNORE NULLS option and could thus also be implemented for dbplyr (for dialects that support it) and sparklyr (though Spark currently does not support it directly, only through first/last with an appropriate range clause, afaik).
Is this really that widely implemented in SQL though? MS SQL, SQLite, and MariaDB all don't implement it. Snowflake does: https://docs.snowflake.com/en/sql-reference/functions/lead.html
But I feel like you can accomplish this pretty easily by first filling the column with missing values up/down depending on whether you are doing lead/lag. Which you probably want to do anyways if you are going to end up comparing the column with its lagged version, right?
library(dplyr)
library(vctrs)
df <- tibble(x = c(5, 4, NA, 2, NA, NA, 6))
df
#> # A tibble: 7 × 1
#> x
#> <dbl>
#> 1 5
#> 2 4
#> 3 NA
#> 4 2
#> 5 NA
#> 6 NA
#> 7 6
df %>%
mutate(x_fill = vec_fill_missing(x, "up")) %>%
mutate(x_lag = lead(x_fill, 1))
#> # A tibble: 7 × 3
#> x x_fill x_lag
#> <dbl> <dbl> <dbl>
#> 1 5 5 4
#> 2 4 4 2
#> 3 NA 2 2
#> 4 2 2 6
#> 5 NA 6 6
#> 6 NA 6 6
#> 7 6 6 NA
Created on 2022-07-16 by the reprex package (v2.0.1)