MicrobiotaProcess
MicrobiotaProcess copied to clipboard
can't calculate metaphlan's output
when i do mpse3 <- mp_import_metaphlan(profile="merged_abundance_table.txt", mapfilename="meta.txt") mpse3 %<>% mp_rrarefy() This is the output Error in vegan::rrarefy(x = .data, sample = raresize) : function is meaningful only for integers (counts)
please refer to this issues. I recommend using the latest version, because some important features were added, you can use remotes::install_github("YuLab-SMU/MicrobiotaProcess")
.
请参考这个问题。我建议使用最新版本,因为添加了一些重要功能,您可以使用
remotes::install_github("YuLab-SMU/MicrobiotaProcess")
.
thankyou!So all the orders I used mp_*?
Yes, you can also refer to the vignette.
What is the error, please provide the example. As mentioned in the issues.
The mp_rrarfy
need your .abundace
should be integers (count), but the output of metaphlan3 is the relative abundance (adjusted by the genome length and the sample depth), you can multiple the sample depth, then round to keep the integer part (this is also the solution of curatedMetagenomicData
). Or you can use the relative abundance to perform the following analysis directly with force=TRUE
in some functions (?mp_cal_alpha to see the help information of the function). This means the specified .abundance will be used to calculate directly not be rarefied (default)
The mpse3
is from the output of MetaPhlAn3
> mpse3
# A MPSE-tibble (MPSE object) abstraction: 33,464 × 36
# OTU=356 | Samples=94 | Assays=Abundance | Taxanomy=Kingdom, Phylum, Class, Order, Family, Genus
OTU Sample Abundance study_name subject_id body_site antibiotics_curr…
<chr> <chr> <dbl> <chr> <chr> <chr> <chr>
1 s__Fusica… S1_a_… 23.2 IjazUZ_20… subIJ_1 stool no
2 s__Bifido… S1_a_… 16.7 IjazUZ_20… subIJ_1 stool no
3 s__Collin… S1_a_… 11.9 IjazUZ_20… subIJ_1 stool no
4 s__Rumino… S1_a_… 10.9 IjazUZ_20… subIJ_1 stool no
5 s__Bifido… S1_a_… 10.7 IjazUZ_20… subIJ_1 stool no
6 s__Bifido… S1_a_… 5.77 IjazUZ_20… subIJ_1 stool no
7 s__Anaero… S1_a_… 4.03 IjazUZ_20… subIJ_1 stool no
8 s__Dorea_… S1_a_… 3.81 IjazUZ_20… subIJ_1 stool no
9 s__Agatho… S1_a_… 2.15 IjazUZ_20… subIJ_1 stool no
10 s__Eggert… S1_a_… 1.74 IjazUZ_20… subIJ_1 stool no
# … with 33,454 more rows, and 29 more variables: study_condition <chr>,
# disease <chr>, age <int>, age_category <chr>, gender <chr>, country <chr>,
# non_westernized <chr>, sequencing_platform <chr>, PMID <chr>,
# number_reads <int>, number_bases <dbl>, minimum_read_length <int>,
# median_read_length <int>, NCBI_accession <chr>, curator <chr>,
# family <chr>, days_from_first_collection <int>, family_role <chr>,
# disease_subtype <chr>, treatment <chr>, disease_location <chr>, …
First, you can calculate the relative abundance for the sample or group level
> mpse3 %<>% mp_cal_abundance(.abundance=Abundance, force=TRUE, relative=TRUE) %>% mp_cal_abundance(.abundance=Abundance, .group=disease, force=TRUE, relative=TRUE)
> mpse3
# A MPSE-tibble (MPSE object) abstraction: 33,464 × 37
# OTU=356 | Samples=94 | Assays=Abundance, RelAbundanceBySample | Taxanomy=Kingdom, Phylum, Class, Order, Family, Genus
OTU Sample Abundance RelAbundanceByS… study_name subject_id body_site
<chr> <chr> <dbl> <dbl> <chr> <chr> <chr>
1 s__Fusica… S1_a_W… 23.2 23.2 IjazUZ_20… subIJ_1 stool
2 s__Bifido… S1_a_W… 16.7 16.7 IjazUZ_20… subIJ_1 stool
3 s__Collin… S1_a_W… 11.9 11.9 IjazUZ_20… subIJ_1 stool
4 s__Rumino… S1_a_W… 10.9 10.9 IjazUZ_20… subIJ_1 stool
5 s__Bifido… S1_a_W… 10.7 10.7 IjazUZ_20… subIJ_1 stool
6 s__Bifido… S1_a_W… 5.77 5.77 IjazUZ_20… subIJ_1 stool
7 s__Anaero… S1_a_W… 4.03 4.03 IjazUZ_20… subIJ_1 stool
8 s__Dorea_… S1_a_W… 3.81 3.82 IjazUZ_20… subIJ_1 stool
9 s__Agatho… S1_a_W… 2.15 2.15 IjazUZ_20… subIJ_1 stool
10 s__Eggert… S1_a_W… 1.74 1.74 IjazUZ_20… subIJ_1 stool
# … with 33,454 more rows, and 30 more variables:
# antibiotics_current_use <chr>, study_condition <chr>, disease <chr>,
# age <int>, age_category <chr>, gender <chr>, country <chr>,
# non_westernized <chr>, sequencing_platform <chr>, PMID <chr>,
# number_reads <int>, number_bases <dbl>, minimum_read_length <int>,
# median_read_length <int>, NCBI_accession <chr>, curator <chr>,
# family <chr>, days_from_first_collection <int>, family_role <chr>, …
Then you can perform the different analysis with mp_diff_analysis
> mpse3 %<>% mp_diff_analysis(.abundance=Abundance, force=TRUE, relative=TRUE, .group=disease)
> mpse3
# A MPSE-tibble (MPSE object) abstraction: 33,464 × 43
# OTU=356 | Samples=94 | Assays=Abundance, RelAbundanceBySample | Taxanomy=Kingdom, Phylum, Class, Order, Family, Genus
OTU Sample Abundance RelAbundanceByS… study_name subject_id body_site
<chr> <chr> <dbl> <dbl> <chr> <chr> <chr>
1 s__Fusica… S1_a_W… 23.2 23.2 IjazUZ_20… subIJ_1 stool
2 s__Bifido… S1_a_W… 16.7 16.7 IjazUZ_20… subIJ_1 stool
3 s__Collin… S1_a_W… 11.9 11.9 IjazUZ_20… subIJ_1 stool
4 s__Rumino… S1_a_W… 10.9 10.9 IjazUZ_20… subIJ_1 stool
5 s__Bifido… S1_a_W… 10.7 10.7 IjazUZ_20… subIJ_1 stool
6 s__Bifido… S1_a_W… 5.77 5.77 IjazUZ_20… subIJ_1 stool
7 s__Anaero… S1_a_W… 4.03 4.03 IjazUZ_20… subIJ_1 stool
8 s__Dorea_… S1_a_W… 3.81 3.82 IjazUZ_20… subIJ_1 stool
9 s__Agatho… S1_a_W… 2.15 2.15 IjazUZ_20… subIJ_1 stool
10 s__Eggert… S1_a_W… 1.74 1.74 IjazUZ_20… subIJ_1 stool
# … with 33,454 more rows, and 36 more variables:
# antibiotics_current_use <chr>, study_condition <chr>, disease <chr>,
# age <int>, age_category <chr>, gender <chr>, country <chr>,
# non_westernized <chr>, sequencing_platform <chr>, PMID <chr>,
# number_reads <int>, number_bases <dbl>, minimum_read_length <int>,
# median_read_length <int>, NCBI_accession <chr>, curator <chr>,
# family <chr>, days_from_first_collection <int>, family_role <chr>, …
>
Finally, you can visualize the result of different analysis with mp_plot_diff_res
mpse3 %>% mp_plot_diff_res(pwidth.abun=0.03, tiplab.size=1.2, offset.effsize=0.5)