posterior
Add Pareto-khat diagnostic
The Pareto-khat diagnostic can be used in general for any Monte Carlo expectation estimate (not just importance sampling).
I'm drafting here different functions and their arguments. Some of these are based on the forthcoming PSIS paper revision.
- [ ] Copy `gpdfit()` from the `loo` package (just point estimates)
- [ ] Add a new `gpdpost()` with different marginal and joint posteriors, see https://github.com/stan-dev/loo/pull/188
- [ ] New `pareto_khat()` function with support for draws objects, with arguments:
  - `draws`
  - `M = 3*sqrt(ndraws)` or `ndraws_tail = 3*sqrt(ndraws)`, the number of draws in the tail
  - `tail = 'right'`; can be `'right'`, `'left'`, or `'both'`; in the case of `'both'`, returns the max k
  - `r_eff = 1`, the relative efficiency from MCMC; can be a scalar or vector value, or `NULL` to compute `r_eff` from the draws

  Returns:
  - k
  - sigma (probably not needed, but useful for completeness)
  - minimum sample size for a reliable Pareto smoothed estimate given k
  - k threshold for a reliable Pareto smoothed estimate given the sample size and `r_eff`
  - estimated Pareto smoothed estimate RMSE convergence rate from k
  - estimated efficiency computed from k, S, and `r_eff`

  The diagnostic print provides informative messages: how big a sample size would be needed for reliable estimates, and how many draws would be needed to halve the RMSE.
- [ ] New `ps_min_ss(k)`: returns the minimum sample size for a reliable Pareto smoothed estimate
- [ ] New `ps_khat_threshold(S)`: computes the khat threshold for a reliable Pareto smoothed estimate given the sample size
- [ ] New `ps_convergence_rate(k, S)`: computes the Pareto smoothed estimate RMSE convergence rate from k and S
@paul-buerkner can you help name the functions?
pinging also @Ozan147
EDIT: edited option names
EDIT: edited names
EDIT: edited typo log -> sqrt
Draft of functions
```r
#' Given Pareto-k, computes the minimum sample size for a reliable Pareto
#' smoothed estimate (to have a small probability of a large error)
ps_min_ss <- function(k) {
  # Eq (11) in the PSIS paper
  10^(1 / (1 - max(0, k)))
}

#' Given sample size S, computes the khat threshold for a reliable Pareto
#' smoothed estimate (to have a small probability of a large error)
ps_khat_threshold <- function(S) {
  1 - 1 / log10(S)
}

#' Given S and a scalar or array of k's, computes the relative
#' convergence rate of the PSIS estimate RMSE
ps_convergence_rate <- function(k, S) {
  # allow an array of k's
  rate <- numeric(length(k))
  # k <= 0: bounded or exponentially decaying tail (the limit at k = 0 is 1)
  rate[k <= 0] <- 1
  # k > 1: non-finite mean
  rate[k > 1] <- 0
  # limit value at k = 1/2
  rate[k == 0.5] <- 1 - 1 / log(S)
  # smooth approximation for the rest (see the Appendix of the PSIS paper)
  ki <- (k > 0 & k < 1 & k != 0.5)
  kk <- k[ki]
  rate[ki] <- pmax(0, (2 * (kk - 1) * S^(2 * kk + 1) + (1 - 2 * kk) * S^(2 * kk) + S^2) /
                        ((S - 1) * (S - S^(2 * kk))))
  rate
}

#' Given scalar k and S, forms a diagnostic message string
pareto_k_diagmsg <- function(k, S) {
  msg <- paste0('With k=', round(k, 2), ' and S=', round(S, 0), ':\n')
  if (k > 1) {
    msg <- paste0(msg, 'All estimates are unreliable. If the distribution of ratios is bounded,\nfurther draws may improve the estimates, but it is not possible to predict\nwhether any feasible sample size is sufficient.')
  } else {
    if (k > ps_khat_threshold(S)) {
      msg <- paste0(msg, 'S is too small, and a sample size larger than ', round(ps_min_ss(k), 0), ' is needed for reliable results.\n')
    } else {
      msg <- paste0(msg, 'To halve the RMSE, approximately ', round(2^(2 / ps_convergence_rate(k, S)), 1), ' times bigger S is needed.\n')
    }
  }
  if (k > 0.7 && k < 1) {
    msg <- paste0(msg, 'Bias dominates RMSE, and the variance-based MCSE is underestimated.\n')
  }
  message(msg)
}
```
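A quick numeric sanity check of the helpers above (the one-line restatements are only so the snippet runs standalone):

```r
# Restated compactly from the draft above so this snippet runs standalone:
ps_min_ss <- function(k) 10^(1 / (1 - max(0, k)))
ps_khat_threshold <- function(S) 1 - 1 / log10(S)

# With S = 100 draws, khat values up to 1 - 1/log10(100) = 0.5 are tolerable:
ps_khat_threshold(100)  # 0.5

# With khat = 0.7, about 10^(1/0.3) ~ 2154 draws are needed for a reliable
# Pareto smoothed estimate:
round(ps_min_ss(0.7))   # 2154

# At the k = 1/2 boundary with S = 1000, the relative RMSE convergence rate
# (the k == 0.5 branch of ps_convergence_rate()) is:
1 - 1 / log(1000)       # approximately 0.855
```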
After some discussion with @avehtari, I've been working on cleaning up the functions added in @ozanstats' work.
@paul-buerkner, one thing to get your opinion on:
Would it make sense for this `pareto_khat` function, which returns a number of things (k, convergence rate, min ss, etc.), to be used in `summarise_draws()`, with multiple columns then created in the summary? Or should functions generally return a single column (`quantile2` being the exception)?
Both would work. Could there perhaps be two functions? One that just returns khat, and an extended version with diagnostics? I prefer that `pareto_khat` just returns the khat value. After all, we do not automatically provide more info for rhat and friends either. But then there could be another function named, say, `pareto_khat_extended` (please think about whether you can come up with a better name), which has the multi-column output you mentioned. What do you think?
Maybe the alternative is to have a `pareto_k_diagnostic_measures()` like `default_summary_measures()`.
Ok, I'll try a few options and add some example outputs here
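For reference, a minimal sketch of the multi-column option. The names here (`pareto_k_diagnostic_measures()`, `khat_stub()`) are placeholders, not the final API, and the moment-based khat is a crude stand-in for the real GPD tail fit:

```r
# All names here are placeholders; khat_stub() is a crude stand-in for the
# real gpdfit-based tail estimator.

# Crude GPD shape estimate from tail exceedances: for a generalized Pareto
# distribution, mean^2 / var = 1 - 2k, so k = (1 - mean^2/var) / 2.
khat_stub <- function(x, ndraws_tail = ceiling(3 * sqrt(length(x)))) {
  xs <- sort(x)
  S <- length(x)
  exceedances <- xs[(S - ndraws_tail + 1):S] - xs[S - ndraws_tail]
  (1 - mean(exceedances)^2 / var(exceedances)) / 2
}

# helpers from the draft above, restated so this snippet runs standalone
ps_min_ss <- function(k) 10^(1 / (1 - max(0, k)))
ps_khat_threshold <- function(S) 1 - 1 / log10(S)

# A measure that returns a named vector: summarise_draws() turns each name
# into its own column, the same mechanism quantile2() uses. A convergence
# rate column could be added the same way.
pareto_k_diagnostic_measures <- function(x) {
  k <- khat_stub(x)
  c(pareto_k = k,
    min_ss = ps_min_ss(k),
    khat_threshold = ps_khat_threshold(length(x)))
}

# intended use (not run here):
# summarise_draws(example_draws(), pareto_k_diagnostic_measures)
```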
I've now implemented a preliminary version of the tail selection and was thinking about the use cases for "right" vs "left" vs "both" when assessing Monte Carlo draws, with h(theta) = theta and r(theta) = 1.
Specifically, I was wondering whether the tails should be fit on the constrained or unconstrained draws, and whether the resulting convergence rates should be consistent. (Note that currently `r_eff = 1`.)
For example, for the `tau` variable from eight schools, I get different thresholds / convergence rates for `tau` and `log(tau)`. @avehtari, is this expected, and if so, which should be used?
```
# first 20 draws of example_draws()

# left tail (does this even make sense for tau as it is lower-bounded?)
  variable pareto_k_l min_ss_l khat_threshold_l convergence_rate_l     S
  <chr>         <dbl>    <dbl>            <dbl>              <dbl> <dbl>
1 tau         -0.349      10              0.231              1        20
2 log_tau      0.0389     11.0            0.231              0.991    20

# right tail
  variable pareto_k_r min_ss_r khat_threshold_r convergence_rate_r     S
  <chr>         <dbl>    <dbl>            <dbl>              <dbl> <dbl>
1 tau          0.478      82.6            0.231              0.740    20
2 log_tau      0.0670     11.8            0.231              0.983    20
```
> Specifically I was wondering whether the tails should be fit on the constrained or unconstrained draws

They should be fit on the scale that is reported.

> and if the resulting convergence rates should be consistent

No. E.g., the tails of a log-transformed variable are much shorter.

> I get different thresholds / convergence rates for tau and log(tau), @avehtari is this expected and if so which should be used?

It is expected, and the one that is reported should be used.

> left tail (does this even make sense for tau as it is lower-bounded?)

It does make sense, and bounded tails have k < 0. But it is usually not that interesting to report, and that's why it would be fine to report the bigger of left and right (although that will cause some overestimation).
Thanks for the answers, the use is clearer now. Then I suggest that the default be `tail = "both"`, so when it is used in `summarise_draws()` for all variables, it covers bounded and unbounded variables without needing to keep track of the supports.
> Then I suggest that the default be tail = "both"

I agree.
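The agreed tail dispatch can be sketched as follows; `pareto_khat_sketch()` and `khat_one_tail()` are placeholder names, and the moment-based shape estimate is a crude stand-in for the real GPD fit:

```r
# khat_one_tail() is a crude stand-in for the real GPD tail fit: for a
# generalized Pareto distribution, mean^2/var = 1 - 2k, so
# k = (1 - mean^2/var) / 2.
khat_one_tail <- function(x, ndraws_tail = ceiling(3 * sqrt(length(x)))) {
  xs <- sort(x)
  S <- length(x)
  exceedances <- xs[(S - ndraws_tail + 1):S] - xs[S - ndraws_tail]
  (1 - mean(exceedances)^2 / var(exceedances)) / 2
}

# "both" is the default, so that summarise_draws() covers bounded and
# unbounded variables without tracking their supports.
pareto_khat_sketch <- function(x, tail = c("both", "right", "left")) {
  tail <- match.arg(tail)
  switch(tail,
    right = khat_one_tail(x),
    left  = khat_one_tail(-x),  # the left tail is the right tail of -x
    both  = max(khat_one_tail(x), khat_one_tail(-x)))
}
```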
Closed with #283