`drop_na()` causes C stack overflow with `Surv` objects

Open abichat opened this issue 6 months ago • 0 comments

Hi,

Calling `drop_na()` on a data frame that has a column of `Surv` objects from the {survival} package fails with a C stack overflow error ("C stack usage is too close to the limit").

library(dplyr)
library(tidyr)
library(survival)

df_surv <- 
  data.frame(time =  c(1, 5, NA, NA,  3, NA),
             event = c(0, 1,  1,  0, NA, NA)) %>% 
  mutate(surv = Surv(time, event)) 

filter(df_surv, !is.na(surv))
#>   time event surv
#> 1    1     0   1+
#> 2    5     1    5
filter(df_surv, is.na(surv))
#>   time event surv
#> 1   NA     1   NA
#> 2   NA     0  NA+
#> 3    3    NA   3?
#> 4   NA    NA  NA?
# drop_na(df_surv, surv) 
# Error: C stack usage  7957792 is too close to the limit

Created on 2025-04-09 with reprex v2.1.1

Session info
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.4.2 (2024-10-31)
#>  os       macOS Sequoia 15.3.2
#>  system   aarch64, darwin20
#>  ui       X11
#>  language (EN)
#>  collate  en_US.UTF-8
#>  ctype    en_US.UTF-8
#>  tz       Europe/Paris
#>  date     2025-04-09
#>  pandoc   3.2 @ /Applications/RStudio.app/Contents/Resources/app/quarto/bin/tools/aarch64/ (via rmarkdown)
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package     * version date (UTC) lib source
#>  cli           3.6.3   2024-06-21 [1] CRAN (R 4.4.0)
#>  digest        0.6.37  2024-08-19 [1] CRAN (R 4.4.1)
#>  dplyr       * 1.1.4   2023-11-17 [1] CRAN (R 4.4.0)
#>  evaluate      1.0.3   2025-01-10 [1] CRAN (R 4.4.1)
#>  fastmap       1.2.0   2024-05-15 [1] CRAN (R 4.4.0)
#>  fs            1.6.5   2024-10-30 [1] CRAN (R 4.4.1)
#>  generics      0.1.3   2022-07-05 [1] CRAN (R 4.4.0)
#>  glue          1.8.0   2024-09-30 [1] CRAN (R 4.4.1)
#>  htmltools     0.5.8.1 2024-04-04 [1] CRAN (R 4.4.0)
#>  knitr         1.49    2024-11-08 [1] CRAN (R 4.4.1)
#>  lattice       0.22-6  2024-03-20 [1] CRAN (R 4.4.2)
#>  lifecycle     1.0.4   2023-11-07 [1] CRAN (R 4.4.0)
#>  magrittr      2.0.3   2022-03-30 [1] CRAN (R 4.4.0)
#>  Matrix        1.7-1   2024-10-18 [1] CRAN (R 4.4.2)
#>  pillar        1.10.1  2025-01-07 [1] CRAN (R 4.4.1)
#>  pkgconfig     2.0.3   2019-09-22 [1] CRAN (R 4.4.0)
#>  purrr         1.0.2   2023-08-10 [1] CRAN (R 4.4.0)
#>  R6            2.5.1   2021-08-19 [1] CRAN (R 4.4.0)
#>  reprex        2.1.1   2024-07-06 [1] CRAN (R 4.4.0)
#>  rlang         1.1.5   2025-01-17 [1] CRAN (R 4.4.1)
#>  rmarkdown     2.29    2024-11-04 [1] CRAN (R 4.4.1)
#>  rstudioapi    0.17.1  2024-10-22 [1] CRAN (R 4.4.1)
#>  sessioninfo   1.2.2   2021-12-06 [1] CRAN (R 4.4.0)
#>  survival    * 3.8-3   2024-12-17 [1] CRAN (R 4.4.1)
#>  tibble        3.2.1   2023-03-20 [1] CRAN (R 4.4.0)
#>  tidyr       * 1.3.1   2024-01-24 [1] CRAN (R 4.4.0)
#>  tidyselect    1.2.1   2024-03-11 [1] CRAN (R 4.4.0)
#>  vctrs         0.6.5   2023-12-01 [1] CRAN (R 4.4.0)
#>  withr         3.0.2   2024-10-28 [1] CRAN (R 4.4.1)
#>  xfun          0.51.1  2025-02-20 [1] Github (yihui/xfun@dd2aaf1)
#>  yaml          2.3.10  2024-07-26 [1] CRAN (R 4.4.0)
#> 
#>  [1] /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/library
#> 
#> ──────────────────────────────────────────────────────────────────────────────

As the output above shows, `filter()` combined with `is.na()` handles this correctly. Would it be possible to make `drop_na()` work with `Surv` columns too?
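In the meantime, a possible workaround is to filter on `is.na()` directly, as in the reprex. The sketch below wraps that idea in a small base R helper. Note that `drop_na_cols()` is a hypothetical name used only for illustration and is not part of tidyr, and it assumes that `is.na()` on each named column returns one logical value per row (which is what the `filter()` output above suggests for `Surv` columns).

```r
library(survival)

# Same data as in the reprex, built with base R only.
df_surv <- data.frame(
  time  = c(1, 5, NA, NA,  3, NA),
  event = c(0, 1,  1,  0, NA, NA)
)
df_surv$surv <- Surv(df_surv$time, df_surv$event)

# Hypothetical helper (NOT a tidyr function): keep only the rows where
# none of the named columns is missing, relying on each column's own
# is.na() method instead of drop_na().
drop_na_cols <- function(data, cols) {
  keep <- rep(TRUE, nrow(data))
  for (col in cols) {
    keep <- keep & !is.na(data[[col]])
  }
  data[keep, , drop = FALSE]
}

res <- drop_na_cols(df_surv, "surv")
res
```

This is only a stopgap: it does not support tidyselect helpers the way `drop_na()` does, and it says nothing about why `drop_na()` itself overflows the stack.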

Thanks, and thanks again for the great work on tidyr!

abichat, Apr 09 '25 13:04