microeco
plot_lefse_cladogram stuck in default use_taxa_num = 200
Hi, thank you very much for developing this nice package for doing LEfSe analysis in R. The LEfSe cladogram based on ggtree is clearer and more beautiful than the original version.
I used a test dataset (1735 OTUs and 15 samples) to run LEfSe in microeco. When I execute $plot_lefse_cladogram, it gets stuck for a long time with the default use_taxa_num = 200.
With this dataset, I tried decreasing use_taxa_num to 124, and then it works fine and fast (~0.45 s); but with use_taxa_num > 124 it hangs for more than 1 hr with no output.
Using the source code in microeco/R/trans_diff.R to check step by step, I found it gets stuck at this line:
tree <- ggtree::ggtree(tree, size = 0.2, layout = 'circular')
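For reference, here is a small sketch of my own (assuming the lefse_diff object from the reproduction code below) showing how I stepped into the method; debug() works on an R6 method because $ returns the underlying function:

debug(lefse_diff$plot_lefse_cladogram)
lefse_diff$plot_lefse_cladogram(use_taxa_num = 200, use_feature_num = 40,
                                clade_label_level = 4, alpha = 0.2)
# step with `n` until the ggtree::ggtree(...) call above is reached
undebug(lefse_diff$plot_lefse_cladogram)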
The full reproduction code is as follows:
library(microeco)
library(ggtree)

meco_qiime2 <- readRDS("meco_qiime2.rds")
meco_qiime2$cal_abund()    # calculate taxa abundance at each taxonomic rank
lefse_diff <- trans_diff$new(dataset = meco_qiime2, method = "lefse", group = "Group")

png("test_lefse_cladogram.png", width = 4000, height = 3200, res = 300)
lefse_diff$plot_lefse_cladogram(use_taxa_num = 124, use_feature_num = 40,
                                clade_label_level = 4, alpha = 0.2)
dev.off()
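As a side note, here is a rough sketch of my own (not from the package) for bisecting the use_taxa_num threshold without waiting an hour per attempt; note that setTimeLimit() only interrupts at R-level checkpoints, so it may not break out of long compiled calls:

for (n in c(100, 124, 125, 150, 200)) {
  t <- system.time(
    res <- try({
      setTimeLimit(elapsed = 120, transient = TRUE)  # cap each attempt at 2 min
      lefse_diff$plot_lefse_cladogram(use_taxa_num = n, use_feature_num = 40,
                                      clade_label_level = 4, alpha = 0.2)
    }, silent = TRUE)
  )
  setTimeLimit(elapsed = Inf)                        # clear the cap
  message("use_taxa_num = ", n, ": ",
          if (inherits(res, "try-error")) "aborted" else "ok",
          " after ", round(t["elapsed"], 2), " s")
}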
RDS file and cladogram: microeco_issue.zip
R session info:
> sessionInfo()
R version 4.0.3 (2020-10-10)
Platform: x86_64-redhat-linux-gnu (64-bit)
Running under: CentOS Linux 8 (Core)
Matrix products: default
BLAS/LAPACK: /usr/lib64/libopenblas-r0.3.3.so
locale:
[1] LC_CTYPE=zh_TW.UTF-8 LC_NUMERIC=C LC_TIME=zh_TW.UTF-8 LC_COLLATE=zh_TW.UTF-8
[5] LC_MONETARY=zh_TW.UTF-8 LC_MESSAGES=zh_TW.UTF-8 LC_PAPER=zh_TW.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=zh_TW.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] microeco_0.3.3
loaded via a namespace (and not attached):
[1] Rcpp_1.0.5 BiocManager_1.30.10 pillar_1.4.7 compiler_4.0.3 RColorBrewer_1.1-2 plyr_1.8.6
[7] tools_4.0.3 digest_0.6.27 aplot_0.0.6 jsonlite_1.7.2 ggtree_2.4.1 tidytree_0.3.3
[13] lifecycle_0.2.0 tibble_3.0.4 gtable_0.3.0 nlme_3.1-149 lattice_0.20-41 mgcv_1.8-33
[19] pkgconfig_2.0.3 rlang_0.4.10 Matrix_1.2-18 rstudioapi_0.13 rvcheck_0.1.8 patchwork_1.1.1
[25] parallel_4.0.3 treeio_1.14.3 dplyr_1.0.2 stringr_1.4.0 cluster_2.1.0 generics_0.1.0
[31] vctrs_0.3.6 grid_4.0.3 tidyselect_1.1.0 glue_1.4.2 data.table_1.13.6 R6_2.5.0
[37] farver_2.0.3 tidyr_1.1.2 ggplot2_3.3.3 purrr_0.3.4 reshape2_1.4.4 magrittr_2.0.1
[43] scales_1.1.1 ellipsis_0.3.1 MASS_7.3-53 splines_4.0.3 permute_0.9-5 ape_5.4-1
[49] colorspace_2.0-0 labeling_0.4.2 stringi_1.5.3 lazyeval_0.2.2 munsell_0.5.0 crayon_1.3.4
[55] vegan_2.5-7
Thank you a lot!
Hi, @chikao0817
The issue comes from the weird taxonomy information "c__c__" in the abund_table of lefse_diff: k__Bacteria|p__Candidatus_Melainabacteria|c__c__|o__Vampirovibrionales

Actually, this part of the code does have a checking and filtering step for such taxonomy. However, it is only designed for unified taxonomy assignments, like the taxonomy info in the example table, which has been transformed with the tidy_taxonomy() function. So if you use tidy_taxonomy() on meco_qiime2$tax_table, the "c__c__" will be converted to "c__"; then, in lefse_diff, the line "k__Bacteria|p__Candidatus_Melainabacteria|c__|o__Vampirovibrionales" can be filtered out automatically by the checking code, and this issue will not exist.

To solve the problem now, you can either check the abund_table in lefse_diff$abund_table, or use tidy_taxonomy() on meco_qiime2$tax_table and recalculate lefse_diff. In my view, uniform taxonomy information is very important for many data analysis methods. It works with this code:
meco_qiime2$tax_table %<>% tidy_taxonomy    # unify taxonomy prefixes; "c__c__" -> "c__"
meco_qiime2$tidy_dataset()                  # keep OTU, taxonomy and sample tables consistent
meco_qiime2$cal_abund()
lefse_diff <- trans_diff$new(dataset = meco_qiime2, method = "lefse", group = "Group")
lefse_diff$plot_lefse_cladogram(use_taxa_num = 200, use_feature_num = 40,
                                clade_label_level = 4, alpha = 0.2)
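To make the conversion concrete, here is a minimal illustration with a hypothetical one-row taxonomy table (my own sketch; the expected output follows the description above):

library(microeco)
library(magrittr)

tax <- data.frame(
  Kingdom = "k__Bacteria",
  Phylum  = "p__Candidatus_Melainabacteria",
  Class   = "c__c__",                # malformed: duplicated prefix, no name
  Order   = "o__Vampirovibrionales",
  stringsAsFactors = FALSE
)
tax %<>% tidy_taxonomy
tax$Class    # expected "c__", which the lefse checking step can then filter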
Chi
Thanks for the quick reply :)
It's very useful to use tidy_taxonomy() to clean the taxonomy table and make the taxonomy information uniform. That solved the problem.
Thank you very much.