mediation Error in stats::model.frame when specifying an lm formula in a function argument, and boot = T

This works fine with boot=F, but throws an error when boot=T. Seems related to #29

library(mediation)
#> mediation: Causal Mediation Analysis
#> Version: 4.5.0
test <- function(f1, f2, boot){
    mtcars$mpg <- mtcars$mpg/max(mtcars$mpg)
    m_med <- lm(f1, mtcars)
    m_out <- lm(f2, mtcars)
    mediate(m_med, m_out, treat = "wt", mediator = 'am', boot = boot)
}

x <- test('am ~ wt', 'mpg ~ wt + am', F)
#success!
x <- test('am ~ wt', 'mpg ~ wt + am', T)
#> Running nonparametric bootstrap
#> Error in stats::model.frame(formula = f1, data = structure(list(am = c(1, : object 'f1' not found

Jun 03 '21 07:06 pinusm

Hi @pinusm, I had a similar issue. I wonder whether you have found any solutions? If yes, would you mind sharing with us?

Aug 02 '23 06:08 ianaianawong

Maybe I'll explain my problem in detail below.

Looks like the two posts related to this problem have not been answered before, so I'm tagging the contributors to the mediation package to attract more attention. My apologies if this tagging is a spam to you. @mdtrinh @weihuangwong @kosukeimai @k-hirose @teppeiyamamoto @ck37 @mspeekenbrink @kevinbrosnan @davraam @christopherkenny @GregFa

I'm using the mediation package in r to analyze the mediating role of 5 mediators between 1 predictor and 1 outcome variable. All variables are continuous. When boot=FALSE, the coding below worked. However, when boot=TRUE, I got the error message below.

olsols <- mediations(datasets, treatment, mediators, outcome, families=c("gaussian","gaussian"), 
                         interaction=FALSE, conf.level=.90, sims=1000, boot=TRUE, boot.ci.type="bca")

Error in stats::model.frame(formula = f1, data = list(IRQ_original_total_mi = c(79,  : 
  object 'f1' not found

Can anyone please advise how can I solve the error?

Before using the mediations function, I did multiple imputations using mice package to handle the missing data.

I've tried to look for similar problems reported by previous researchers, but I'm afraid I can only find one relevant post (this post that I'm replying to). However, there weren't any responses to that post so I decided to create a new post here.

Here is part of the dataset: https://drive.google.com/file/d/11rv1iniNjw6RpgqnYHdhD3FLy_hZbKNi/view?usp=sharing

Here are the codes I used:

## create a dataframe for the multiple imputation ====
    alldata4 <- alldata3[ ,c('ECR_1', 'ECR_2', 'ECR_3', 'ECR_4', 'ECR_5', 'ECR_6',
                             'ECR_7', 'ECR_8', 'ECR_9', 'ECR_10', 'ECR_11', 'ECR_12',  
                             'ECR_13', 'ECR_14', 'ECR_15', 'ECR_16', 'ECR_17', 'ECR_18', 
                             'ECR_19', 'ECR_20', 'ECR_21', 'ECR_22', 'ECR_23', 'ECR_24', 
                             'ECR_25', 'ECR_26', 'ECR_27', 'ECR_28', 'ECR_29', 'ECR_30', 
                             'ECR_31', 'ECR_32', 'ECR_33', 'ECR_34', 'ECR_35', 'ECR_36', 
                             'IRQ_1', 'IRQ_2', 'IRQ_3', 'IRQ_4', 'IRQ_5', 'IRQ_6', 
                             'IRQ_7', 'IRQ_8', 'IRQ_9', 'IRQ_10', 'IRQ_11', 'IRQ_12', 
                             'IRQ_13', 'IRQ_14', 'IRQ_15', 'IRQ_16', 'IRQ_17', 'IRQ_18', 
                             'IRQ_19', 'IRQ_20', 'IRQ_21', 'IRQ_22', 'IRQ_23', 'IRQ_24', 
                             'IRQ_25', 'IRQ_26', 'IRQ_27', 'IRQ_28', 'IRQ_29', 'IRQ_30', 'IRQ_31',"IRQ_32",
                             'CEER_1', 'CEER_2', 'CEER_3', 'CEER_4','CEER_5','CEER_6',
                             'CEER_7','CEER_8','CEER_9','CEER_10','CEER_11','CEER_12',
                             'IRIS_1', 'IRIS_2', 'IRIS_3', 'IRIS_4','IRIS_5','IRIS_6',
                             'IRIS_7','IRIS_8','IRIS_9','IRIS_10','IRIS_11','IRIS_12',
                             'IRIS_13', 'IRIS_14', 'IRIS_15', 'IRIS_16', 'IRIS_17', 'IRIS_18', 
                             'IRIS_19', "IRIS_20", "IRIS_21", "IRIS_22", "IRIS_23", "IRIS_24", 
                             'IRIS_25', 'IRIS_26', 'IRIS_27', 'IRIS_28',
                             'DERS_1', 'DERS_2', 'DERS_3', 'DERS_4','DERS_5','DERS_6',
                             'DERS_7','DERS_8','DERS_9','DERS_10','DERS_11','DERS_12',
                             'DERS_13', 'DERS_14', 'DERS_15', 'DERS_16',
                             'CTS2_1_R', 'CTS2_2_R', 'CTS2_3_R', 'CTS2_4_R', 'CTS2_5_R', 'CTS2_6_R', 
                             'CTS2_7_R', 'CTS2_8_R', 'CTS2_9_R', 'CTS2_10_R', 'CTS2_11_R', 'CTS2_12_R', 
                             'CTS2_13_R', 'CTS2_14_R', 'CTS2_15_R', 'CTS2_16_R', 'CTS2_17_R', 'CTS2_18_R', 
                             'CTS2_19_R', 'CTS2_20_R', 'CTS2_21_R', 'CTS2_22_R', 'CTS2_23_R', 'CTS2_24_R', 
                             'CTS2_25_R', 'CTS2_26_R', 'CTS2_27_R', 'CTS2_28_R', 'CTS2_29_R', 'CTS2_30_R', 
                             'CTS2_31_R', 'CTS2_32_R', 'CTS2_33_R', 'CTS2_34_R', 'CTS2_35_R', 'CTS2_36_R', 
                             'CTS2_37_R', 'CTS2_38_R', 'CTS2_39_R', 'CTS2_40_R', 'CTS2_41_R', 'CTS2_42_R', 
                             'CTS2_43_R', 'CTS2_44_R', 'CTS2_45_R', 'CTS2_46_R', 'CTS2_47_R', 'CTS2_48_R', 
                             'CTS2_49_R', 'CTS2_50_R', 'CTS2_51_R', 'CTS2_52_R', 'CTS2_53_R', 'CTS2_54_R',
                             'CARS_1_R','CARS_2_R', 'CARS_3_R', 'CARS_4_R', 'CARS_5_R', 'CARS_6_R', 
                             'CARS_7_R','CARS_8_R', 'CARS_9_R', 'CARS_10_R', 'CARS_11_R', 'CARS_12_R', 
                             'CARS_13_R','CARS_14_R', 'CARS_15_R', 'CARS_16_R','CARS_17_R','CARS_18_R', 
                             'CARS_19_R','CARS_20_R', 'CARS_21_R', 'CARS_22_R', 'CARS_23_R','CARS_24_R', 
                             'CARS_25_R','CARS_26_R', 'CARS_27_R', 'CARS_28_R', 'CARS_29_R','CARS_30_R', 
                             'CARS_31_R','CARS_32_R', 'CARS_33_R', 'CARS_34_R',  
                             'Relation_length_months', 'Relation_status', 'Income',
                             'Current_age', 'Gender', 'Aboriginal', 
                             'Commit_1', 'Commit_2', 'Commit_3','Commit_4', 'Commit_5','Commit_6','Commit_7',                           
                             'AUDIT_1','AUDIT_2', 'AUDIT_3', 'AUDIT_4', 'AUDIT_5', 'AUDIT_6',
                             'AUDIT_7', 'AUDIT_8', 'AUDIT_9',  'AUDIT_10')]

    ## multiple imputation 
       alldata4.mi <- mice::mice(alldata4, m = 5, method = 'pmm')
    
    ## obtain each dataset from the stacked dataset (the first one first)
        ECR.alldata4.mi.1 <- mice::complete(alldata4.mi, c(1))
        ECR.alldata4.mi.2 <- mice::complete(alldata4.mi, c(2))
        ECR.alldata4.mi.3 <- mice::complete(alldata4.mi, c(3))
        ECR.alldata4.mi.4 <- mice::complete(alldata4.mi, c(4))
        ECR.alldata4.mi.5 <- mice::complete(alldata4.mi, c(5))

## total score of ECR - anxiety after multiple imputation for each of the 5 imputed dataset 
ECR.alldata4.mi.1$ECR_anxiety_average_mi <- rowMeans(ECR.alldata4.mi.1[c("ECR_2", "ECR_4", "ECR_6", "ECR_8","ECR_10","ECR_12","ECR_14","ECR_16","ECR_18",
                                                                 "ECR_20","ECR_22","ECR_24","ECR_26","ECR_28","ECR_30","ECR_32","ECR_34","ECR_36")])

## total score of IRQ - original after multiple imputation for each of the 5 imputed dataset 
ECR.alldata4.mi.1$IRQ_original_total_mi <- rowSums(ECR.alldata4.mi.1[c("IRQ_1", "IRQ_3", "IRQ_5", "IRQ_7","IRQ_9","IRQ_11","IRQ_13","IRQ_15","IRQ_17",
                                                               "IRQ_19","IRQ_21","IRQ_23","IRQ_25","IRQ_27","IRQ_29","IRQ_31")])

#### I created the total scores of the variables included in the mediations code. I did not present all of them here to save space.

    # assign variables to their roles for the "mediations" function
    datasets <- list(ECR.alldata4.mi.1=ECR.alldata4.mi.1, ECR.alldata4.mi.2=ECR.alldata4.mi.2, 
                     ECR.alldata4.mi.3=ECR.alldata4.mi.3, 
                     ECR.alldata4.mi.4=ECR.alldata4.mi.4, ECR.alldata4.mi.5=ECR.alldata4.mi.5) # list of multiply imputed data sets
    
    mediators <- c("IRQ_original_total_mi", "IRQ_partner_total_mi", "CEER_total_mi", "IRIS_total_mi", "DERS_total_mi")
    outcome <- c("CTS2_psy_perp_total_mi")
    treatment <- c("ECR_anxiety_average_mi","ECR_anxiety_average_mi","ECR_anxiety_average_mi","ECR_anxiety_average_mi","ECR_anxiety_average_mi")  # note how the treatment indicator is repeated to match the number of datasets
    
    # mediation analysis
    olsols <- mediations(datasets, treatment, mediators, outcome, families=c("gaussian","gaussian"), 
                         interaction=FALSE, conf.level=.90, sims=1000, boot=TRUE, boot.ci.type="bca")

Aug 02 '23 06:08 ianaianawong

I'm sorry.. It's been a long time and I don't remember the context in which I encountered the problem, so I don't know where to look for to see if I've found a solution/workaround..

Aug 02 '23 07:08 pinusm

Hi @pinusm, no problem. Thank you for taking the time to let me know :)

Aug 02 '23 13:08 ianaianawong

Same as pinusm. I searched my computer to see if I could find anything, but came up empty.

Aug 02 '23 16:08 grasshoppermouse

Hi @grasshoppermouse ,no problem. Thank you for letting me know.

Aug 03 '23 03:08 ianaianawong

These looks like different problems. @pinusm's issue is about environment scoping, as lm stores the call as-is, which will have the formula as "f1", but f1 will not be defined in mediation's environment. That's a function of how stats::lm works. You can get around that with the usual evaluation trick with do.call.

library(mediation)
#> Loading required package: MASS
#> Loading required package: Matrix
#> Loading required package: mvtnorm
#> Loading required package: sandwich
#> mediation: Causal Mediation Analysis
#> Version: 4.5.0
test <- function(f1, f2, boot){
  mtcars$mpg <- mtcars$mpg/max(mtcars$mpg)
  m_med <- do.call(what = 'lm', list(formula = f1, data = mtcars))
  m_out <- do.call(what = 'lm', list(formula = f2, data = mtcars))
  mediate(m_med, m_out, treat = "wt", mediator = 'am', boot = boot)
}

x <- test('am ~ wt', 'mpg ~ wt + am', FALSE)
summary(x)
#> 
#> Causal Mediation Analysis 
#> 
#> Quasi-Bayesian Confidence Intervals
#> 
#>                Estimate 95% CI Lower 95% CI Upper p-value    
#> ACME            0.00111     -0.03294         0.04    0.94    
#> ADE            -0.15939     -0.20537        -0.12  <2e-16 ***
#> Total Effect   -0.15828     -0.18978        -0.13  <2e-16 ***
#> Prop. Mediated -0.00738     -0.24726         0.20    0.94    
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Sample Size Used: 32 
#> 
#> 
#> Simulations: 1000
x <- test('am ~ wt', 'mpg ~ wt + am', TRUE)
#> Running nonparametric bootstrap
summary(x)
#> 
#> Causal Mediation Analysis 
#> 
#> Nonparametric Bootstrap Confidence Intervals with the Percentile Method
#> 
#>                 Estimate 95% CI Lower 95% CI Upper p-value    
#> ACME            0.000246    -0.033198         0.04    0.92    
#> ADE            -0.157900    -0.231674        -0.11  <2e-16 ***
#> Total Effect   -0.157654    -0.203822        -0.12  <2e-16 ***
#> Prop. Mediated -0.001560    -0.212690         0.20    0.92    
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Sample Size Used: 32 
#> 
#> 
#> Simulations: 1000

^{Created on 2023-08-03 with reprex v2.0.2}

Aug 03 '23 20:08 christopherkenny

Hi @christopherkenny, thanks a lot for your help. I'll try out the codes later.

Aug 04 '23 06:08 ianaianawong

I have the same issue. When boot=TRUE, mediate() will fail with `Running nonparametric bootstrap for ordered outcome

Error in eval(mf, parent.frame()) : object 'f1' not found`

Mar 19 '24 03:03 wJDKnight

mediation mediation copied to clipboard

Error in stats::model.frame when specifying an lm formula in a function argument, and boot = T

mediation
mediation copied to clipboard