Does lib_cut only remove samples that fall below a minimum sequencing threshold? Changing this parameter, even to values below the smallest library size in my data, significantly affects my results.
I split my phyloseq object into two groups (males and females) with the subset_samples function. Among the males, the sample with the fewest sequences has 6765 reads. If I set lib_cut = 1000, I get the following error:
Error in { : task 1 failed - "Zero variances have been detected for the following taxa: e34a053cddf79b4d5afbff62ad711416, f0ec1e3ab9e5dd882c308f475cce2294, 22d275e5dc69935461a8dd3e0712ca79, c02197cbcd90125f7e1898facd3ac488, 8ec607a59fdadaf5f1236c1336195e81, 312b6cc51514af31ac441b021374eef3, 1c2efc3afcba41f09a1fe4aadace6961, 8529911d4908eaf1c7de2806da98b4f4, 9432e20a1484dcc9a6c2d23328c723ad, 48758e23e84547d15eddcba8b9644499, 2128a381aed59d3742f8e36532c967d8, 9aefe06948373d61f9c15fa27cb2cfe7, 77453ac3539bd8a7f83f84867d518901, ec472bccbec3f3d21a16fa1eea09c27e, daac4678ae5f16ab6006d44c0ff94da3, 33c989846b5d5192297ecb6f16b20d35, 7a09a885a1e1ddc2813115e2a0eb93e7, bbcd32ec102a58f2b3c4307450bb7f58, 1e60e1a2cff0c36a1806486ef8b8381f, 15a1edfe28815461da4e988900a412b1, b78098e09b769388a6f961049124f1e7, efc7e628b3a50dcc424d45843a3883de, 26bbcf81d1ee6134ec34e80be77c0347, 12030900f98ebae2c07b7ce3374a3fa8, 457719094fc29014429c338b7751e502, 848a7e42914b2166ed25e02dcf76d353, 5040895610d58443624f7036cf48eed0
I checked some of these taxa, and they have abundance values in many samples, so I think they are fine. However, if I set lib_cut = 6000, the analysis runs without errors and gives me a result. Why do things change so dramatically? As I understand it, lib_cut is only used to discard samples below a sequencing threshold, so any value below the minimum number of sequences in my samples should not affect my results, because it should not eliminate any sample.
In other analyses of these same males with other response variables, I do not get this error; everything works regardless of the lib_cut value, as long as it stays below the minimum number of sequences. For the females everything also works correctly: changing lib_cut below the minimum number of sequences changes nothing, as expected.
Any ideas?
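One way to confirm that neither threshold actually removes a sample is to compare the per-sample library sizes against each lib_cut value. The sketch below does this in base R on a small made-up count matrix (taxa in rows, samples in columns); the object names and counts are hypothetical, chosen only so that the smallest library has 6765 reads as in my male subset. With a phyloseq object, sample_sums() should give the same per-sample totals.

```r
# Hypothetical count matrix: 3 taxa (rows) x 4 samples (columns).
# matrix() fills column by column, so each row of numbers below is one sample.
counts <- matrix(
  c(4000, 2000,  765,
    5000, 3000, 1000,
    6000, 2500, 1500,
    7000, 4000, 2000),
  nrow = 3,
  dimnames = list(paste0("taxon", 1:3), paste0("sample", 1:4))
)

# Library size = total reads per sample.
lib_sizes <- colSums(counts)
min_lib <- min(lib_sizes)

# Number of samples whose library size is below a given threshold.
n_dropped <- function(lib_sizes, lib_cut) sum(lib_sizes < lib_cut)

dropped_1000 <- n_dropped(lib_sizes, 1000)
dropped_6000 <- n_dropped(lib_sizes, 6000)

print(lib_sizes)
print(c(min_lib = min_lib, dropped_1000 = dropped_1000, dropped_6000 = dropped_6000))
```

If both counts are zero on the real data, the two runs start from the same set of samples, and the difference must come from somewhere other than sample filtering.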
Sorry for the inconvenience, but I think I found a strange solution. In brief, I got the error when comparing males from two localities, “Coquimatlán” and “Chamela”, with lib_cut = 1000. It apparently went away when I set lib_cut = 6000, although that change should not affect my results because the sample with the fewest sequences has 6765 reads. As for the solution: following alphabetical order, ANCOMBC2 used Chamela as the reference group, so I rearranged the order of the factor levels with the following line:
sample_data(IMLY6_Male)$Location = factor(sample_data(IMLY6_Male)$Location, c("Coquimatlán","Chamela"))
With Coquimatlán as the reference group, everything worked correctly with lib_cut = 1000. In fact, the result was almost identical to the one I got with lib_cut = 6000 (the direction of the effects was inverted, as expected, because I changed the reference group). Therefore, I believe the analysis was performed correctly. Is the problem due to the accent on the letter "a" in the word “Coquimatlán”?
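To separate the effect of the reference level from the effect of the accent, one option is to build an ASCII-only copy of the labels and rerun the analysis with each level order. The sketch below shows only the factor handling, in base R, on a made-up Location vector standing in for sample_data(IMLY6_Male)$Location; the variable names are hypothetical, and it does not establish whether the accent is the actual cause.

```r
# Hypothetical Location column; "\u00e1" is the accented "a" written as an
# escape so the script does not depend on the file encoding.
location <- c("Chamela", "Coquimatl\u00e1n", "Chamela", "Coquimatl\u00e1n")

# Default factor(): levels are sorted, so "Chamela" comes first and would be
# taken as the reference level.
loc_default <- factor(location)

# Explicit level order: "Coquimatlán" first, as in my workaround.
loc_releveled <- factor(location, levels = c("Coquimatl\u00e1n", "Chamela"))

# ASCII-only copy of the labels, to test the analysis without the accent.
location_ascii <- sub("Coquimatl\u00e1n", "Coquimatlan", location, fixed = TRUE)
loc_ascii <- factor(location_ascii, levels = c("Coquimatlan", "Chamela"))

print(levels(loc_default))
print(levels(loc_releveled))
print(levels(loc_ascii))
```

If the error with lib_cut = 1000 also disappears when Chamela is kept as the reference but the labels are ASCII-only, the accent is the likely culprit; if it only disappears when the reference level changes, the level order is.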