
generate_taxa_test_pair() issue with time.var = NULL in random effect

Open 16svale opened this issue 4 months ago • 5 comments

Dear developer, I am encountering an issue when using the generate_taxa_test_pair() function to fit a model with a random effect.

When I call the basic linda() function as below, it works:

linda.obj.all <- linda(
  asv_table[, ind],
  meta[ind, ],
  formula = '~SampleType1 + (1|SSF)',
  feature.dat.type = 'count',
  prev.filter = 0.1,
  mean.abund.filter = 0.004,
  is.winsor = TRUE,
  outlier.pct = 0.03,
  p.adj.method = "BH",
  alpha = 0.05
)

However, generate_taxa_test_pair(), which I expected to behave the same way, gives me an error when called as below:

# create test.list
test.list <- generate_taxa_test_pair(
  data.obj = data.obj,
  group.var = "sampletype1",
  adj.vars = NULL,
  subject.var = "SSF",
  time.var = NULL,
  feature.dat.type = "count",
  feature.level = "Genus",
  prev.filter = 0.1,
  abund.filter = 0.004
)

Output:

Rule 1 passed: data.obj is a list.
Rule 2 passed: meta.dat is a data.frame.
Rule 3 passed: The row names of feature.tab match the row names of feature.ann.
Rule 4 passed: The column names of feature.tab match the row names of meta.dat.
Rule 5 passed: feature.tab is a matrix.
Rule 6 passed: feature.ann is a matrix.
Please note: The data components should follow base R data.frame and matrix structures, not phyloseq's formal class.
Validation passed.
Note: Passing validation does not guarantee the absence of all data issues. Further data exploration may be needed.
Your data is in raw format ('Raw'). Normalization is crucial for further analyses.
Now, 'mStat_normalize_data' function is automatically applying 'TSS' transformation.
Data has been successfully normalized using TSS method.
0 features are filtered!
The filtered data has 98 samples and 49 features that will be tested!
Fit linear mixed effects models ... Completed.
Error in grepl(pattern = time.var, x = df_name) : invalid 'pattern' argument

The generate_taxa_test_pair() function seems designed to optionally handle longitudinal data, given it accepts a time.var parameter, which I set to NULL. This design choice suggests it should be capable of running analyses without a time component, but the implementation does not gracefully handle a NULL time.var.
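
The error message itself can be reproduced in base R without the package, which is consistent with a NULL time.var reaching grepl() unguarded. Below is a minimal sketch of that, plus one possible guard. The value of `df_name` and the helper `matches_time_var()` are my own assumptions for illustration; they are not taken from the MicrobiomeStat source.

```r
# Assumed stand-in for whatever name the package is matching against.
df_name <- "sampletype1B"

# Passing NULL as the pattern triggers the same error as in the output above.
res <- tryCatch(
  grepl(pattern = NULL, x = df_name),
  error = function(e) conditionMessage(e)
)
print(res)

# Hypothetical guard (not the package's code): when time.var is NULL,
# report "no match" instead of calling grepl() at all.
matches_time_var <- function(time.var, df_name) {
  if (is.null(time.var)) {
    return(rep(FALSE, length(df_name)))
  }
  grepl(pattern = time.var, x = df_name, fixed = TRUE)
}

no_time   <- matches_time_var(NULL, df_name)      # NULL time.var: no error
with_time <- matches_time_var("visit", "visit2")  # ordinary string pattern
```

A guard of this kind would let the time-related post-processing be skipped when no time variable is supplied, though where exactly it belongs depends on code not shown here.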

Here is the relevant part of the core code of the generate_taxa_test_pair() function:

  # Case where time.var is NULL

  if (is.null(time.var)) {
    if (is.null(group.var)) {
      fixed_effects <- adj.vars_str
      if (is.null(fixed_effects)) {
        fixed_effects <- "1"  # Intercept-only model
      }
    } else {
      if (!is.null(adj.vars_str)) {
        fixed_effects <- paste(adj.vars_str, "+", group.var)
      } else {
        fixed_effects <- group.var
      }
    }
    random_effects <- paste("(1 |", subject.var, ")")
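
To check what this branch would produce for my call, here is a self-contained sketch of the same logic. The helper name `build_formula_str()`, the way `adj.vars_str` is derived from `adj.vars`, and the final step that joins fixed and random effects into one string are all assumptions on my side, since that part of the source is not in the excerpt above.

```r
# Hypothetical helper mirroring the NULL-time.var branch of the excerpt.
build_formula_str <- function(group.var = NULL, adj.vars = NULL, subject.var) {
  # Assumption: adjustment variables are joined with "+" when present.
  adj.vars_str <- if (is.null(adj.vars)) NULL else paste(adj.vars, collapse = " + ")

  if (is.null(group.var)) {
    # No group variable: adjustment terms only, or an intercept-only model.
    fixed_effects <- if (is.null(adj.vars_str)) "1" else adj.vars_str
  } else if (!is.null(adj.vars_str)) {
    fixed_effects <- paste(adj.vars_str, "+", group.var)
  } else {
    fixed_effects <- group.var
  }

  random_effects <- paste("(1 |", subject.var, ")")

  # Assumption: the two parts are combined into a one-sided formula string.
  paste("~", fixed_effects, "+", random_effects)
}

f_str <- build_formula_str(group.var = "sampletype1", subject.var = "SSF")
print(f_str)
f <- as.formula(f_str)  # parses as a valid one-sided formula
```

Under these assumptions the result has the same structure as the formula I pass to linda() directly, so the formula construction does not look like the failing step. That would fit with the log showing "Fit linear mixed effects models ... Completed." before the grepl() error.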

Do you have any solution for this issue?

Thank you in advance. I hope this helps me make better use of the package. Best regards, Valentina

16svale, Feb 18 '24 17:02