André Müller

30 comments by André Müller

1) The peak memory consumption with default settings is usually around 2 times the size of the uncompressed sequences and the database size on disk is usually the same as...
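The rule of thumb in the first half of this comment can be turned into a quick back-of-the-envelope estimate. The sketch below is only an illustration: the function name and the `factor` parameter are hypothetical and not part of MetaCache, and the factor of 2 is the "usually around" figure quoted above for default settings, not a guarantee.

```python
def estimate_peak_memory_bytes(uncompressed_sequence_bytes: int,
                               factor: float = 2.0) -> int:
    """Rough peak-memory estimate for a database build.

    Assumes the rule of thumb quoted above: peak memory is roughly
    `factor` times the size of the uncompressed reference sequences.
    This helper is hypothetical; it is not a MetaCache function.
    """
    if uncompressed_sequence_bytes < 0:
        raise ValueError("sequence size must be non-negative")
    return int(uncompressed_sequence_bytes * factor)


# Example: 40 GB of uncompressed reference sequences (made-up figure).
reference_bytes = 40 * 10**9
estimate = estimate_peak_memory_bytes(reference_bytes)
print(f"estimated peak memory: {estimate / 10**9:.0f} GB")
```

Treat the result as a starting point for sizing a machine; actual consumption depends on the settings used.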

You just might need to experiment a bit with all of that. I guess I would start with larger partitions (1TB or more) and reduce the partition size in case...

Sorry that I overlooked this issue for so long. We don't have any experience with Ion Torrent files. As long as FASTA or FASTQ files contain DNA sequences (so letter...

Hi, since MetaCache can classify down to the sequence level and individual sequences do not have an official NCBI taxid, we use negative taxids to identify the individual sequences in...

At the moment, there are no plans for a pure library version of MetaCache. One could of course take the core elements of the current implementation and make them into...

Hi, accession numbers are a total mess. We use a regex to identify NCBI-style accession or accession.version sequence identifiers. For some reason that I don't remember we only allow the...

This should be solved by the changes in version [v2.4.2](https://github.com/muellan/metacache/releases/tag/v2.4.2).

In theory, the longer the reads are, the more accurate the results should become. This is certainly what we have seen for Illumina reads of different lengths. The most crucial...

If you want to distinguish reads at the strain level, then mapping accuracy is of course extremely important. I would first try the default settings which promise the highest accuracy and...

Just to give a little update on this issue. I've recently processed Oxford Nanopore datasets with varying read lengths of 100 up to 21000. The N50 was relatively low (in...