sourmash
Application of `f_unique_to_query` and `threshold_bp`
Hi,
I am using sourmash to profile bacterial composition and abundance in shotgun WGS stool samples, and I have two questions:
Could you expand on what you mean by this statement about the `f_unique_to_query` column: 'This column should be used in any analysis that needs to avoid double-counting matches.' Currently, I am using all the rows in the output table. Am I double-counting by not using `f_unique_to_query`?
My current parameters are k=31, s=1000, `threshold_bp`=2000. In your experience, will this low threshold return a very high number of false positives?
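To make the second question concrete, here is a rough sketch of how few hashes this threshold might correspond to. It assumes that `s` is the scaled value and that `threshold_bp` is turned into a minimum number of shared hashes by dividing by it. That is my understanding rather than something I have confirmed, and `min_hashes_for_threshold` is a hypothetical helper of mine, not a sourmash function.

```python
import math


def min_hashes_for_threshold(threshold_bp: int, scaled: int) -> int:
    """Approximate number of shared hashes implied by a bp threshold.

    Assumption: each retained hash stands for roughly `scaled` bp,
    so the threshold in hashes is threshold_bp / scaled, rounded up.
    """
    return max(1, math.ceil(threshold_bp / scaled))


scaled = 1000  # my s=1000, assumed to be the scaled value

# My current setting.
print(min_hashes_for_threshold(2000, scaled))

# A 50,000 bp threshold, for comparison only.
print(min_hashes_for_threshold(50000, scaled))
```

If that assumption holds, a match could be reported on the basis of only about two shared hashes, which is why I am worried about false positives.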
Many thanks