nimble icon indicating copy to clipboard operation
nimble copied to clipboard

use of `na.rm` in `max` in nf code gives untrapped error

Open paciorek opened this issue 1 year ago • 3 comments

The presence of a second arg seems to cause the compiler to try to use pairmax which causes problems. See about error trapping the use of na.rm and possible enhanced documentation of DSL use of max.

nsp <- 10
nsite <- 20
max_period <- round(runif(nsite, 2, 10))
max_plot <- array( round(runif( nsite * max(max_period, na.rm = TRUE), 2, 6)),
                   dim = c( nsite, max(max_period, na.rm = TRUE)))
# 4D array of expected abundances - species x site x survey period x plot within site
lambda <- array( runif( nsp * nsite * max(max_period, na.rm = TRUE) * max(max_plot, na.rm = TRUE), 0.2, 5),
                 dim = c( nsp, nsite, max(max_period, na.rm = TRUE), max(max_plot, na.rm = TRUE)))

Runique <- nimbleRcall(
  function( x = double( 4 )){},
  Rfun = 'unique',
  returnType =  double(2))

calc_site_richness <- nimbleFunction(
  run = function(
    lambda = double(4),
    nsp = integer(0),
    nsite = integer(0),
    max_period = integer(1),
    max_plot = integer(2)){
   
    species_id <- array( NA, dim = c( nsp, nsite, max(max_period, na.rm = TRUE), max(max_plot, na.rm = TRUE)))
    richness_site <- array( NA, dim = c(nsite, max(max_period, na.rm = TRUE)))
    for( s in 1:nsite) {
      for( t in 1:max_period[s]){
        richness_site[s, t] <- length( Runique( species_id[ 1:nsp, s, t, 1:max_plot[s, t]]))
        for( j in 1:max_plot[s, t]){
          for( h in 1:nsp){
            if( lambda[ h, s, t, j] > 1){
              species_id[ h, s, t, j] <- h
            }
          }
        }
      }
    }
    return(richness_site)
    returnType(double(2))
  }
)

# works
calc_site_richness( lambda = lambda,
                    nsp = nsp,
                    nsite = nsite,
                    max_period = max_period,
                    max_plot = max_plot)

# Won't compile - cryptic error:
# Error in parse(text = nimDeparse(code), keep.source = FALSE) :
#   <text>:1:9: unexpected '='
# 1: pairmax(=
#              ^
C_calc_site_richness <- compileNimble( calc_site_richness )

paciorek avatar Apr 12 '23 01:04 paciorek

The issue is that the minMaxHandler substitutes pair if there are two args.

Perhaps simply check for named args and error out if they occur.

paciorek avatar Nov 27 '23 00:11 paciorek

I can deal with this issue in minMaxHandler, but I noticed slightly troubling inconsistency:

  • min and max behave in compiled nimble code as if na.rm=TRUE, ignoring NAs.
  • mean and prod and sd in compiled nimble code behave as if na.rm=FALSE, returning NA if there are any NAs present.

On initial glance for min and max, it looks like Eigen's NaNPropagation may control the behavior.

I don't really want to get bogged down in this, so I suppose I may just ignore this inconsistency given that in modeling we wouldn't expect NAs. We could add a note to Table 11.3 noting the NA behavior for these various functions.

@perrydv @danielturek if you have thoughts, please let me know.

paciorek avatar Nov 28 '23 02:11 paciorek

Modifying minMaxHandler to error out (with an informative message) in the presence of any named arguments sounds like a sound first step. The error message could even mention "in particular, the min and max functions in compiled nimble code do not support the na.rm argument.

The inconsistencies between min/max and mean/prod/sd feels more worrisome. Without you getting bogged down in it, it would be nice if all these behaved the same, and perhaps universally did not ignore NA's. I take it it's not a simple flag to try setting in Eigen?

Short of that, yes, I guess documenting how it actually does behaving might be next best. If you could easily get compiled min and max to not ignore NA's, then at least that would be an improvement, although still inconsistent between compiled and uncompiled behaviour.

Below is the quick test I put together.

library(nimble)

nfDef <- nimbleFunction(
    setup = function() {},
    run = function() {},
    methods = list(
        myMin = function(x = double(1)) {
            returnType(double())
            return(min(x))
        },
        myMax = function(x = double(1)) {
            returnType(double())
            return(max(x))
        },
        myMean = function(x = double(1)) {
            returnType(double())
            return(mean(x))
        },
        myProd = function(x = double(1)) {
            returnType(double())
            return(prod(x))
        },
        mySD = function(x = double(1)) {
            returnType(double())
            return(sd(x))
        }
    )
)

Rnf <- nfDef()
Cnf <- compileNimble(Rnf)

x <- 1:5
Rnf$myMin(x);  Cnf$myMin(x)
Rnf$myMax(x);  Cnf$myMax(x)
Rnf$myMean(x); Cnf$myMean(x)
Rnf$myProd(x); Cnf$myProd(x)
Rnf$mySD(x);   Cnf$mySD(x)

x <- c(1, 2, NA, 4, 5)
Rnf$myMin(x);  Cnf$myMin(x)
Rnf$myMax(x);  Cnf$myMax(x)
Rnf$myMean(x); Cnf$myMean(x)
Rnf$myProd(x); Cnf$myProd(x)
Rnf$mySD(x);   Cnf$mySD(x)

danielturek avatar Nov 28 '23 15:11 danielturek