I use test_diff for differential enrichment analysis, having imputed my data with the MLE method beforehand. I found that for the same protein, the p-values from different runs diverge, as does the number of significant proteins. Is this normal? The figures show part of the results from two runs: one contains 19 differentially expressed proteins, while the other contains only 3.

Dec 23 '22 05:12 Neo-xbx-00

Did anything change in between those two runs, or was it literally just running the function test_diff consecutively on the same object?

Dec 23 '22 18:12 adomingues

> Did anything change in between those two runs, or was it literally just running the function test_diff consecutively on the same object?

Nothing changed between the two runs. I simply ran the function test_diff consecutively on the same data frame. Every time I run it, the results are different.

Dec 24 '22 06:12 Neo-xbx-00

    library("DEP")
    library("dplyr")  # provides %>%, filter() and select()

    # Prepare my data from MaxQuant.
    df_protein <- read.table("proteinGroups.txt", sep = "\t", header = TRUE) %>%
      filter(Reverse != "+", Potential.contaminant != "+", Only.identified.by.site != "+",
             Score > 20, Unique.peptides > 1) %>%
      select(2, 64:69, 81)
    df_experiment <- read.table("ExperimentalDesign.txt", sep = "\t", header = TRUE)

    # Check for duplicated ids, then make the names unique and check again.
    colnames(df_protein)
    df_protein$id %>% duplicated() %>% any()
    df_protein_unique <- make_unique(df_protein, "id", "Majority.protein.IDs", delim = ";")
    colnames(df_protein_unique)
    df_protein_unique$name %>% duplicated() %>% any()

    # Build the SummarizedExperiment from the LFQ intensity columns.
    df_LFQ <- grep("LFQ.", colnames(df_protein_unique))
    df_se <- make_se(df_protein_unique, df_LFQ, df_experiment)

    # Filter, normalize and impute.
    df_missval <- filter_missval(df_se, thr = 1)
    df_norm <- normalize_vsn(df_missval)
    df_imp_MLE <- DEP::impute(df_norm, fun = "MLE")

    # Test every condition against the control "FB".
    df_diff_all_contrasts <- test_diff(df_imp_MLE, type = "control", control = "FB")
    df_diff_all_results <- get_df_wide(df_diff_all_contrasts)

    # Denote significant proteins based on user defined cutoffs
    df_DEP <- add_rejections(df_diff_all_contrasts, alpha = 0.05, lfc = log2(2))
    df_DEP_results <- get_results(df_DEP)

    # Count the significant proteins.
    df_DEP_results %>% filter(significant) %>% nrow()

Dec 24 '22 06:12 Neo-xbx-00

This is my R script. Every time I run it, I get a totally different result, which is quite weird. The same thing also happens with the example data.

Dec 24 '22 07:12 Neo-xbx-00
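
One way to narrow this down, sketched here under assumptions rather than as a confirmed diagnosis: if the variation comes from a stochastic step (MLE imputation is a plausible candidate, though nothing in this thread confirms it), then fixing R's random seed before `impute()` should make reruns agree, while `test_diff` applied twice to one already-imputed object should agree regardless. The sketch uses DEP's bundled example data (`UbiLength`, `UbiLength_ExpDesign`) and the control condition `"Ctrl"` as in the package vignette. `impute_with_seed` and `p_values` are hypothetical helpers, and whether `set.seed()` actually governs the MLE imputation is an assumption to verify by running the check.

```r
library("DEP")
library("dplyr")

# Build a normalized SummarizedExperiment from DEP's bundled example data.
data_unique <- UbiLength %>%
  filter(Reverse != "+", Potential.contaminant != "+") %>%
  make_unique("Gene.names", "Protein.IDs", delim = ";")
lfq_cols <- grep("LFQ.", colnames(data_unique))
se_norm <- make_se(data_unique, lfq_cols, UbiLength_ExpDesign) %>%
  filter_missval(thr = 1) %>%
  normalize_vsn()

# Hypothetical helper: fix the seed, impute with MLE, return the assay matrix.
impute_with_seed <- function(se, seed) {
  set.seed(seed)
  SummarizedExperiment::assay(DEP::impute(se, fun = "MLE"))
}

# Hypothetical helper: run test_diff and return only the raw p-value columns.
p_values <- function(se_imputed) {
  tested <- test_diff(se_imputed, type = "control", control = "Ctrl")
  rd <- as.data.frame(SummarizedExperiment::rowData(tested))
  rd[, grep("_p.val$", colnames(rd)), drop = FALSE]
}

# Check 1: same seed twice. TRUE would mean imputation is reproducible once seeded.
same_seed <- identical(impute_with_seed(se_norm, 1), impute_with_seed(se_norm, 1))

# Check 2: different seeds. FALSE would mean imputation is the step that varies.
diff_seed <- identical(impute_with_seed(se_norm, 1), impute_with_seed(se_norm, 2))

# Check 3: test_diff twice on one imputed object, so imputation is held fixed.
set.seed(1)
se_imp <- DEP::impute(se_norm, fun = "MLE")
same_test <- isTRUE(all.equal(p_values(se_imp), p_values(se_imp)))

print(c(same_seed = same_seed, diff_seed = diff_seed, same_test = same_test))
```

If check 3 agrees while check 2 differs, the divergence would come from rerunning the imputation rather than from `test_diff` itself, and calling `set.seed()` once before `DEP::impute()` in the script above would be the thing to try.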

Different p-value in each independent run?
