IRanges icon indicating copy to clipboard operation
IRanges copied to clipboard

IRanges::IntegerList & min / max

Open LiNk-NY opened this issue 6 years ago • 6 comments

Hi Michael @lawremi, Hervé @hpages,

A (recent?) change to IRanges IntegerList returns some unexpected results. In base R, the min(integer()) returns Inf.

When taking the min or max of IRanges::IntegerList, you get the max integer value in R (see below). Should this match what base R returns?

-M

library(IRanges)
#> Loading required package: BiocGenerics
#> Loading required package: parallel
#> 
#> Attaching package: 'BiocGenerics'
#> The following objects are masked from 'package:parallel':
#> 
#>     clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
#>     clusterExport, clusterMap, parApply, parCapply, parLapply,
#>     parLapplyLB, parRapply, parSapply, parSapplyLB
#> The following objects are masked from 'package:stats':
#> 
#>     IQR, mad, sd, var, xtabs
#> The following objects are masked from 'package:base':
#> 
#>     anyDuplicated, append, as.data.frame, basename, cbind,
#>     colMeans, colnames, colSums, dirname, do.call, duplicated,
#>     eval, evalq, Filter, Find, get, grep, grepl, intersect,
#>     is.unsorted, lapply, lengths, Map, mapply, match, mget, order,
#>     paste, pmax, pmax.int, pmin, pmin.int, Position, rank, rbind,
#>     Reduce, rowMeans, rownames, rowSums, sapply, setdiff, sort,
#>     table, tapply, union, unique, unsplit, which, which.max,
#>     which.min
#> Loading required package: S4Vectors
#> Loading required package: stats4
#> 
#> Attaching package: 'S4Vectors'
#> The following object is masked from 'package:base':
#> 
#>     expand.grid
il <- IntegerList(A = 1:3, B = integer())
max(il)
#>           A           B 
#>           3 -2147483647
min(il)
#>          A          B 
#>          1 2147483647

Created on 2018-08-08 by the reprex package (v0.2.0).

Session info
devtools::session_info()
#> Session info -------------------------------------------------------------
#>  setting  value                                      
#>  version  R version 3.5.1 Patched (2018-07-30 r75013)
#>  system   x86_64, linux-gnu                          
#>  ui       X11                                        
#>  language (EN)                                       
#>  collate  en_US.UTF-8                                
#>  tz       America/New_York                           
#>  date     2018-08-08
#> Packages -----------------------------------------------------------------
#>  package      * version date       source        
#>  backports      1.1.2   2017-12-13 CRAN (R 3.5.0)
#>  base         * 3.5.0   2018-05-01 local         
#>  BiocGenerics * 0.27.1  2018-06-17 Bioconductor  
#>  compiler       3.5.0   2018-05-01 local         
#>  datasets     * 3.5.0   2018-05-01 local         
#>  devtools       1.13.6  2018-06-27 cran (@1.13.6)
#>  digest         0.6.15  2018-01-28 CRAN (R 3.5.0)
#>  evaluate       0.11    2018-07-17 CRAN (R 3.5.1)
#>  graphics     * 3.5.0   2018-05-01 local         
#>  grDevices    * 3.5.0   2018-05-01 local         
#>  htmltools      0.3.6   2017-04-28 CRAN (R 3.5.0)
#>  IRanges      * 2.15.16 2018-07-18 Bioconductor  
#>  knitr          1.20    2018-02-20 CRAN (R 3.5.0)
#>  magrittr       1.5     2014-11-22 CRAN (R 3.5.0)
#>  memoise        1.1.0   2017-04-21 CRAN (R 3.5.0)
#>  methods      * 3.5.0   2018-05-01 local         
#>  parallel     * 3.5.0   2018-05-01 local         
#>  Rcpp           0.12.18 2018-07-23 CRAN (R 3.5.1)
#>  rmarkdown      1.10    2018-06-11 CRAN (R 3.5.0)
#>  rprojroot      1.3-2   2018-01-03 CRAN (R 3.5.0)
#>  S4Vectors    * 0.19.19 2018-07-18 Bioconductor  
#>  stats        * 3.5.0   2018-05-01 local         
#>  stats4       * 3.5.0   2018-05-01 local         
#>  stringi        1.2.4   2018-07-20 CRAN (R 3.5.1)
#>  stringr        1.3.1   2018-05-10 CRAN (R 3.5.0)
#>  tools          3.5.0   2018-05-01 local         
#>  utils        * 3.5.0   2018-05-01 local         
#>  withr          2.1.2   2018-03-15 CRAN (R 3.5.0)
#>  yaml           2.2.0   2018-07-25 CRAN (R 3.5.1)

LiNk-NY avatar Aug 08 '18 18:08 LiNk-NY

I hit this problem the other day and will come up with a fix.

lawremi avatar Aug 08 '18 21:08 lawremi

AFAIK this is what min(), max(), and range() have been doing on CompressedIntegerList objects for years.

hpages avatar Aug 09 '18 08:08 hpages

Yes, it's definitely not new. The annoying thing is that Inf is real not integer so there is a type conversion of the result that only happens when there is a zero-length element. Simplest thing is to check for a zero-length element and coerce to NumericList before going down to C code. Not very efficient but simpler than hacking it in C.

lawremi avatar Aug 09 '18 14:08 lawremi

@LiNk-NY @lawremi @LTLA In regard with what was discussed in issue #10, is it OK to close this issue? Thanks!

hpages avatar Mar 23 '19 20:03 hpages

I don't have any strong feelings either way. I could imagine that the inconsistency with base R might cause problems to someone, but a function would need to explicitly protect against zero-length vectors for any behaviour of max. So the current behaviour is probably fine.

LTLA avatar Mar 23 '19 22:03 LTLA

I'm OK with closing it, but it would be nice to have a discussion/rationale to point people to in the future.

lawremi avatar Mar 23 '19 23:03 lawremi