microeco
tidy_taxonomy function does not properly clean up tax_table
Hello ChiLiubio, First of all, I would like to thank you for your amazing work in developing the microeco package. It has been a relief for us microbiologists without a solid R programming foundation to have stumbled upon your creation. I am using tidy_taxonomy to get rid of the unassigned/unknown taxa in my tax_table so that they do not show up in my plots, but so far I have not made it work, and the unassigned taxa keep showing up. Here is a snippet of the code I am using:
tidy_taxonomy(
  MicroEcoDataObject$tax_table,
  column = "all",
  pattern = c(".unassigned.", ".uncultur.", ".unknown.", ".unidentif.", ".unclassified.", ".No blast hit.", ".Incertae.sedis."),
  replacement = "",
  ignore.case = TRUE,
  na_fill = ""
)
I am afraid I cannot provide here the .qza files I obtained in QIIME2 for building the microtable object that the microeco package uses, but I will be happy to send them by email if needed.
Thank you very much in advance
Hi. Do you mean the function does not work? Please show the full steps so that I can judge whether the problem comes from another issue. I guess it should work normally if it is used properly, as this function is simple. Here are the steps for you to check the data.
tmp_raw <- MicroEcoDataObject$tax_table
# please first use the default params to check it
tmp_new <- tidy_taxonomy(tmp_raw)
View(tmp_raw)
View(tmp_new)
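Besides inspecting the two tables with View(), the comparison can also be done programmatically. The following is a minimal sketch in base R, assuming both tables are character data frames with the same shape; the toy values below are made up for illustration and stand in for the real tmp_raw and tmp_new.

```r
# Toy stand-ins for the raw and cleaned taxonomy tables (illustrative values only).
tmp_raw <- data.frame(
  Phylum = c("p__Firmicutes", "p__unknown"),
  Genus  = c("g__Bacillus", "g__uncultured"),
  row.names = c("ASV1", "ASV2"),
  stringsAsFactors = FALSE
)

# Pretend this is what a cleaning step returned: ASV2 labels blanked to the prefix.
tmp_new <- tmp_raw
tmp_new$Phylum[2] <- "p__"
tmp_new$Genus[2]  <- "g__"

# Row/column positions of every cell that differs between the two tables.
changed <- which(as.matrix(tmp_raw) != as.matrix(tmp_new), arr.ind = TRUE)
print(changed)

# Number of cells that were modified.
n_changed <- nrow(changed)
```

If n_changed is zero on the real data, the cleaning step did not alter anything, which points to a problem upstream of the plotting code.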
Hello, I have tried inserting your suggested code into mine, but unknown taxa still keep appearing when I create a taxa heatmap. This is the original workflow I used before your suggestion:
Importing .qza from QIIME2
taxonomy_microeco <- "/home/victor/Documentos/BioinformaticsPisaUbuntu/Replica1RECOVERsoilUMHbioinformatics/DataWO234/Taxonomy/taxonomy.qza"
tree_microeco <- "/home/victor/Documentos/BioinformaticsPisaUbuntu/Replica1RECOVERsoilUMHbioinformatics/DataWO234/tree/rooted-tree.qza"
table_microeco <- "/home/victor/Documentos/BioinformaticsPisaUbuntu/Replica1RECOVERsoilUMHbioinformatics/DataWO234/dirDADA2/table.qza"
rep_microeco <- "/home/victor/Documentos/BioinformaticsPisaUbuntu/Replica1RECOVERsoilUMHbioinformatics/DataWO234/dirDADA2/representative-sequences.qza"
Creating the data frame for metadata
metadata2microeco <- data.frame(
  SampleID = c("S10-16S", "S11-16S", "S9-16S", "S12-16S", "S1-16S", "S8-16S", "S5-16S", "S6-16S", "S7-16S"),
  Time = c("T60", "T60", "T60", "T60", "T0", "T60", "T60", "T60", "T60"),
  Group = c("Inoculum", "Inoculum", "Inoculum", "Inoculum", "No Inoculum", "No Inoculum", "No Inoculum", "No Inoculum", "No Inoculum"),
  Type = c("LDPE", "LLDPE", "No plastic", "FILM", "No plastic", "FILM", "No plastic", "LDPE", "LLDPE")
)
Using file2meco package to create Microtable Object
library(microeco)
library(file2meco)
MicroEcoDataObject <- qiime2meco(
  table_microeco,
  sample_table = metadata2microeco,
  taxonomy_table = taxonomy_microeco,
  phylo_tree = tree_microeco,
  rep_fasta = rep_microeco,
  auto_tidy = TRUE
)
MicroEcoDataObject
Filtering taxa_table using tidy_taxonomy function
tidy_taxonomy(
  MicroEcoDataObject$tax_table,
  column = "all",
  pattern = c(".unassigned.", ".uncultur.", ".unknown.", ".unidentif.", ".unclassified.", ".No blast hit.", ".Incertae.sedis."),
  replacement = "",
  ignore.case = TRUE,
  na_fill = ""
)
Creating taxa heatmap
TaxaHeatmap <- trans_abund$new(dataset = MicroEcoDataObject, taxrank = "Genus", ntaxa = 40)
TaxaHeatmap$plot_heatmap(facet = c("Type", "Group"), xtext_keep = FALSE, withmargin = FALSE, plot_breaks = c(0.01, 0.1, 1, 10))
This is the heatmap. As you can see, odd taxon names such as 67-14 or MB-A2-108 keep appearing. I do not know whether tidy_taxonomy is equipped to remove these labels.
Thank you very much in advance
Hi. First, you should assign the result back to MicroEcoDataObject$tax_table when using tidy_taxonomy.
MicroEcoDataObject$tax_table <- tidy_taxonomy(MicroEcoDataObject$tax_table)
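The reason the assignment matters is that an ordinary R function works on a copy of its argument and returns the modified copy, so calling it without storing the result leaves the original table as it was. A minimal base R sketch of this behaviour follows; blank_unclassified is a hypothetical stand-in written for illustration and is not the microeco implementation, and the table values are made up.

```r
# Toy taxonomy table (illustrative values only).
tax <- data.frame(
  Genus = c("g__Bacillus", "g__unclassified", "g__Pseudomonas"),
  stringsAsFactors = FALSE
)

# Hypothetical cleaning function: returns a modified copy of its input.
blank_unclassified <- function(tab) {
  tab$Genus <- sub(".*unclassified.*", "g__", tab$Genus, ignore.case = TRUE)
  tab
}

# Called without assignment: the result is discarded and `tax` is untouched.
invisible(blank_unclassified(tax))
unchanged <- tax$Genus

# Assigned back: the cleaned copy replaces the original.
tax <- blank_unclassified(tax)
cleaned <- tax$Genus

print(unchanged)
print(cleaned)
```

The same reasoning applies to the workflow above: the heatmap is built from MicroEcoDataObject, so the cleaned table only has an effect once it is stored back in MicroEcoDataObject$tax_table.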
Second, taxa like 67-14 or MB-A2-108 are not what is generally regarded as useless features, so the function cannot filter them with the default parameters. If you want to filter all taxa whose names contain numbers, you can add an extra item to the pattern parameter.
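Before adding a new item to pattern, it can help to test the regular expression on a few labels in base R. This is a minimal sketch: digit_pattern is an assumed expression chosen for illustration, and the labels are toy values in the "g__" prefixed style. Note that a pattern this broad would also hit legitimate names that happen to contain digits, so it may need narrowing for real data.

```r
# Toy genus labels (illustrative values only).
genus <- c("g__Bacillus", "g__67-14", "g__MB-A2-108", "g__Pseudomonas", "g__")

# Assumed extra pattern: any label that contains at least one digit.
digit_pattern <- ".*[0-9].*"

# Check which labels the pattern would hit before using it on real data.
hits <- grepl(digit_pattern, genus)
matched <- genus[hits]
print(matched)

# The pattern could then be appended to the patterns already used in this
# thread and the result assigned back (left commented out here because it
# needs the real microtable object):
# MicroEcoDataObject$tax_table <- tidy_taxonomy(
#   MicroEcoDataObject$tax_table,
#   pattern = c(".unassigned.", ".uncultur.", ".unknown.", ".unidentif.",
#               ".unclassified.", ".No blast hit.", ".Incertae.sedis.",
#               digit_pattern)
# )
```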
Problem solved!! Thank you very much @ChiLiubio