MinHashMetagenomics

Further space improvement

Open dkoslicki opened this issue 8 years ago • 0 comments

The space required can be cut significantly by only using the k-mers that are present in the union of the training/reference genomes. This will greatly reduce the size of the Bloom filter of the sample. It would need a more creative way to estimate the cardinality of the whole sample, though (e.g. HyperLogLog).
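A rough sketch of the idea, not the repository's actual code: sample k-mers are kept only if they occur in the union of the reference k-mers, while a small HyperLogLog counter sees every sample k-mer so the whole-sample cardinality can still be estimated. All names here (`kmers`, `reference_kmer_union`, `sketch_sample`, the `HyperLogLog` class) are hypothetical, a plain Python `set` stands in for the Bloom filter, and k-mers are not canonicalized.

```python
import hashlib
import math


def kmers(seq, k):
    """Yield every k-mer of seq (no reverse-complement handling, for brevity)."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def reference_kmer_union(genomes, k):
    """Union of the k-mers of all training/reference genomes."""
    union = set()
    for genome in genomes:
        union.update(kmers(genome, k))
    return union


class HyperLogLog:
    """Minimal HyperLogLog with 2**p registers (assumes p >= 7)."""

    def __init__(self, p=12):
        self.p = p
        self.m = 1 << p
        self.registers = [0] * self.m

    def add(self, item):
        digest = hashlib.blake2b(item.encode(), digest_size=8).digest()
        h = int.from_bytes(digest, "big")          # 64-bit hash
        idx = h >> (64 - self.p)                   # first p bits pick a register
        rest = h & ((1 << (64 - self.p)) - 1)      # remaining 64 - p bits
        # Rank = position of the leftmost 1-bit in the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        if rank > self.registers[idx]:
            self.registers[idx] = rank

    def estimate(self):
        alpha = 0.7213 / (1 + 1.079 / self.m)
        raw = alpha * self.m * self.m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            # Small-range correction (linear counting).
            return self.m * math.log(self.m / zeros)
        return raw


def sketch_sample(sample_reads, reference_kmers, k):
    """Keep only sample k-mers found in the reference union, and
    estimate the number of distinct k-mers in the whole sample."""
    kept = set()            # stand-in for the (now much smaller) Bloom filter
    hll = HyperLogLog()
    for read in sample_reads:
        for km in kmers(read, k):
            hll.add(km)                 # every k-mer counts toward cardinality
            if km in reference_kmers:
                kept.add(km)            # only reference k-mers are stored
    return kept, hll.estimate()
```

With this layout the memory used for the sample is bounded by the size of the reference union plus the fixed-size HyperLogLog registers, rather than by the number of distinct k-mers in the sample.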

dkoslicki, May 29 '17 17:05