IRanges
IRanges copied to clipboard
IRanges::IntegerList & min / max
Hi Michael @lawremi, Hervé @hpages,
A (recent?) change to IRanges IntegerList returns some unexpected results.
In base R, the min(integer()) returns Inf.
When taking the min or max of IRanges::IntegerList, you get the max integer
value in R (see below).
Should this match what base R returns?
-M
library(IRanges)
#> Loading required package: BiocGenerics
#> Loading required package: parallel
#>
#> Attaching package: 'BiocGenerics'
#> The following objects are masked from 'package:parallel':
#>
#> clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
#> clusterExport, clusterMap, parApply, parCapply, parLapply,
#> parLapplyLB, parRapply, parSapply, parSapplyLB
#> The following objects are masked from 'package:stats':
#>
#> IQR, mad, sd, var, xtabs
#> The following objects are masked from 'package:base':
#>
#> anyDuplicated, append, as.data.frame, basename, cbind,
#> colMeans, colnames, colSums, dirname, do.call, duplicated,
#> eval, evalq, Filter, Find, get, grep, grepl, intersect,
#> is.unsorted, lapply, lengths, Map, mapply, match, mget, order,
#> paste, pmax, pmax.int, pmin, pmin.int, Position, rank, rbind,
#> Reduce, rowMeans, rownames, rowSums, sapply, setdiff, sort,
#> table, tapply, union, unique, unsplit, which, which.max,
#> which.min
#> Loading required package: S4Vectors
#> Loading required package: stats4
#>
#> Attaching package: 'S4Vectors'
#> The following object is masked from 'package:base':
#>
#> expand.grid
il <- IntegerList(A = 1:3, B = integer())
max(il)
#> A B
#> 3 -2147483647
min(il)
#> A B
#> 1 2147483647
Created on 2018-08-08 by the reprex package (v0.2.0).
Session info
devtools::session_info()
#> Session info -------------------------------------------------------------
#> setting value
#> version R version 3.5.1 Patched (2018-07-30 r75013)
#> system x86_64, linux-gnu
#> ui X11
#> language (EN)
#> collate en_US.UTF-8
#> tz America/New_York
#> date 2018-08-08
#> Packages -----------------------------------------------------------------
#> package * version date source
#> backports 1.1.2 2017-12-13 CRAN (R 3.5.0)
#> base * 3.5.0 2018-05-01 local
#> BiocGenerics * 0.27.1 2018-06-17 Bioconductor
#> compiler 3.5.0 2018-05-01 local
#> datasets * 3.5.0 2018-05-01 local
#> devtools 1.13.6 2018-06-27 cran (@1.13.6)
#> digest 0.6.15 2018-01-28 CRAN (R 3.5.0)
#> evaluate 0.11 2018-07-17 CRAN (R 3.5.1)
#> graphics * 3.5.0 2018-05-01 local
#> grDevices * 3.5.0 2018-05-01 local
#> htmltools 0.3.6 2017-04-28 CRAN (R 3.5.0)
#> IRanges * 2.15.16 2018-07-18 Bioconductor
#> knitr 1.20 2018-02-20 CRAN (R 3.5.0)
#> magrittr 1.5 2014-11-22 CRAN (R 3.5.0)
#> memoise 1.1.0 2017-04-21 CRAN (R 3.5.0)
#> methods * 3.5.0 2018-05-01 local
#> parallel * 3.5.0 2018-05-01 local
#> Rcpp 0.12.18 2018-07-23 CRAN (R 3.5.1)
#> rmarkdown 1.10 2018-06-11 CRAN (R 3.5.0)
#> rprojroot 1.3-2 2018-01-03 CRAN (R 3.5.0)
#> S4Vectors * 0.19.19 2018-07-18 Bioconductor
#> stats * 3.5.0 2018-05-01 local
#> stats4 * 3.5.0 2018-05-01 local
#> stringi 1.2.4 2018-07-20 CRAN (R 3.5.1)
#> stringr 1.3.1 2018-05-10 CRAN (R 3.5.0)
#> tools 3.5.0 2018-05-01 local
#> utils * 3.5.0 2018-05-01 local
#> withr 2.1.2 2018-03-15 CRAN (R 3.5.0)
#> yaml 2.2.0 2018-07-25 CRAN (R 3.5.1)
I hit this problem the other day and will come up with a fix.
AFAIK this is what min(), max(), and range() have been doing on CompressedIntegerList objects for years.
Yes, it's definitely not new. The annoying thing is that Inf is real not integer so there is a type conversion of the result that only happens when there is a zero-length element. Simplest thing is to check for a zero-length element and coerce to NumericList before going down to C code. Not very efficient but simpler than hacking it in C.
@LiNk-NY @lawremi @LTLA In regard with what was discussed in issue #10, is it OK to close this issue? Thanks!
I don't have any strong feelings either way. I could imagine that the inconsistency with base R might cause problems to someone, but a function would need to explicitly protect against zero-length vectors for any behaviour of max. So the current behaviour is probably fine.
I'm OK with closing it, but it would be nice to have a discussion/rationale to point people to in the future.