`roommate.checkPreferences` overhead
Thanks very much for this package!
I am running roommate(utils=utils) 1000s of times with changing utils but small population size (N < 1000). So, any overhead adds up.
For small N, the overhead is actually substantial, and I traced it to roommate.checkPreferences. I wonder if this check could be skipped using a flag to roommate. In fact, when pref is generated from utils by sortIndexOneSided(as.matrix(utils)), I wonder if checkPreferences can always be skipped? Since it seems sortIndexOneSided returns prefs as needed in cpp.
I thought the bottleneck in checkPreferences was the loop to create comp, but it seems not (I guess it is the sorting). Still, I think comp can be generated more easily by:
comp <- array(seq_len(NROW(pref)), dim = dim(pref)) - upper.tri(pref)
MWE & benchmarking
library('matchingR')
library('microbenchmark')
irving = utils::getFromNamespace(ns='matchingR','cpp_wrapper_irving')
validate = utils::getFromNamespace(ns='matchingR','roommate.validate')
checkPref = utils::getFromNamespace(ns='matchingR','roommate.checkPreferences')
gen.utils = function(n,seed=NULL){
set.seed(seed)
x = rnorm(n)
u = dnorm(outer(x,x,`-`))
}
my.validate = function(u){
n = ncol(u)
sortIndexOneSided(matrix(u[-seq(1,n^2,n+1)],n-1,n))
}
my.checkpref <- function(pref) {
# check if pref is using R instead of C++ indexing
if (all(apply(rbind(pref, seq_len(NCOL(pref))), 2, sort) ==
matrix(seq_len(NCOL(pref)), nrow = NCOL(pref), ncol = NCOL(pref)))) {
return(pref - 1)
}
comp <- array(seq_len(NROW(pref)), dim = dim(pref)) - upper.tri(pref)
# check if pref has a complete listing otherwise given an error
if (all(apply(pref, 2, sort) == comp)) {
return(pref)
}
return(NULL)
}
microbenchmark(
# validate(u),my.validate(u),my.checkpref(my.validate(u)), # VALIDATION
roommate(u=u),irving(validate(u))+1,irving(my.validate(u))+1, # FULL MODEL
setup={u=gen.utils(100,seed=0)},
check='equal',
times=10)
output
Loading required package: Rcpp
Unit: microseconds
expr min lq mean median uq max neval
roommate(u = u) 5937.956 6142.166 6575.959 6349.9525 6501.501 8439.542 10
irving(validate(u)) + 1 6113.019 6245.212 6938.835 6516.9050 7663.952 8583.290 10
irving(my.validate(u)) + 1 855.716 866.679 1161.646 884.0885 958.542 3303.109 10
Thanks for opening this issue!
In fact, when pref is generated from utils by sortIndexOneSided(as.matrix(utils)), I wonder if checkPreferences can always be skipped?
I think this makes sense.
In addition, I'm also fine with adding a validate argument to roommate(...) that defaults to TRUE.
Would you be interested in opening a PR?
Great - thanks! Yes, I will open a PR, though I'm a bit busy just now so might not be right away.