heatmap-plot

Open strengejacke opened this issue 6 years ago • 0 comments

Hi @strengejacke - apologies for my delay here. Yeah, I have some MWE for each. First, for the ePCP (expected proportion correctly predicted), we are looking for a higher value, bounded between 0 and 1, where 1 = a model that perfectly predicted all actual observations. Here is some sample code using mtcars:

library(OOmisc) # for ePCP
library(ggplot2)

logitmod <- glm(vs ~ mpg, data = mtcars); summary(logitmod)
logitmod2 <- glm(vs ~ mpg + cyl, data = mtcars); summary(logitmod2)

y <- mtcars$vs
pred1 <- predict(logitmod, type="response")
pred2 <- predict(logitmod2, type="response")

# ePCP, expected proportion of correct prediction
epcp1 <- ePCP(pred1, y, alpha = 0.05) # define observed values and obtain predicted values
epcp2 <- ePCP(pred2, y, alpha = 0.05)

epcpdata <- data.frame(rbind(epcp1, epcp2))
epcpdata$model <- c(1,2)
epcpdata$count <- factor(c(1,2), label = c("Model1","Model2"))

# Now the plot
ggplot(epcpdata, aes(x=model,y=ePCP,colour=count)) +
  geom_bar(position=position_dodge(), stat="identity", fill = "darkgray") +
  geom_errorbar(aes(ymin=lower, ymax=upper),
                width=.2,
                position=position_dodge(.9)) +
  labs(x="Model Specification",
       y="Expected Proportion of Correct Prediction",
       title = "Comparing ePCP between Model 1 and Model 2",
       colour="Model Specification") +
  theme_bw()

And next, for the heatmapFit, which is a little less flexible, the goal is to compare model predictions with smoothed empirical predictions. The 45-degree line references a perfect fit. Any deviance from that line suggests a loss in goodness of fit. The “p-value” legend shows if any deviance is statistically significant (dark color means statistical significance). Using the same models as an example from mtcars, it would look something like:

## heatmapfit
library(heatmapFit) # for heat maps

# quick rescale to bound predictions between 0 and 1
range <- function(x){
  (x-min(x))/(max(x)-min(x))
  }

pred1 <- range(pred1)
pred2 <- range(pred2)

heatmap.fit(y, pred1, reps = 1000, legend = TRUE)
heatmap.fit(y, pred2, reps = 1000, legend = TRUE)

** You can copy and paste this code to see the figures. Pretty clean, efficient renderings of model fit.

** And of course, ROC curves are always great, simple model fit checks. Let me know thoughts if you have them. Hope this helps a bit!

Originally posted by @pdwaggoner in https://github.com/easystats/performance/issues/38#issuecomment-487404535

May 24 '19 19:05 strengejacke