Robert Edgar

97 comments by Robert Edgar

Hi @rchikhi Minor feature request/suggestion for future runs: can you combine all micro-assemblies into one FASTA file? This file should not be too big, only around 1 Gb or so....
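A minimal sketch of what combining per-run FASTA files into one file could look like. The function name `combine_fasta` and the input file layout are assumptions made for illustration; this is not the pipeline's actual code.

```python
def combine_fasta(input_paths, output_path):
    """Concatenate FASTA files into one file and return the record count.

    Hypothetical helper: file names and layout are assumptions.
    """
    n_records = 0
    with open(output_path, "w") as out:
        for path in input_paths:
            ends_with_newline = True
            with open(path) as fh:
                for line in fh:
                    if line.startswith(">"):
                        n_records += 1
                    out.write(line)
                    ends_with_newline = line.endswith("\n")
            # Guard against a missing final newline, which would glue the
            # next file's header onto the last sequence line.
            if not ends_with_newline:
                out.write("\n")
    return n_records
```

Note that this sketch does not rename records, so contig names that repeat across runs would collide in the combined file; prefixing each header with a run identifier would be one way around that.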

Serratax provides the identity of the source organism. They allow BLAST top hits as an approximate guide down to genus, which is a bad method for our situation. Serratax is...

Can we close this? Or unassign me? From my perspective Serratax is the solution.

@taltman Can we close this? Or unassign me? From my perspective Serratax is the solution. If there is an open issue for me, please clarify, thanks.

Variant calling suggested by @ababaian from Slack:

```
java -Xmx12G -jar /home/ubuntu/software/GenomeAnalysisTK.jar \
  -R hgr1.gatk.fa -T HaplotypeCaller \
  -ploidy 2 --max_alternate_alleles 6 \
  -I $LIBRARY.bam -o $LIBRARY.hgr1.vcf
```

Assigning @taltman for coverage plot & variant analysis vs. closest genome if close enough (say, >=97% identity per the minimap2 alignment).
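The ">=97% identity" cutoff could be applied to minimap2 PAF output along the lines of the sketch below. It assumes identity is taken as residue matches (PAF column 10) divided by alignment block length (column 11), which is only one of several possible definitions; the comment does not say which one is meant. The function names are hypothetical.

```python
def paf_identity(line):
    """Return (query, target, identity) for one PAF record.

    Assumption: identity = column 10 (residue matches) divided by
    column 11 (alignment block length).
    """
    fields = line.rstrip("\n").split("\t")
    matches = int(fields[9])
    block_len = int(fields[10])
    identity = matches / block_len if block_len else 0.0
    return fields[0], fields[5], identity


def close_enough(paf_lines, min_identity=0.97):
    """Keep alignments whose identity is at least min_identity."""
    hits = []
    for line in paf_lines:
        if not line.strip():
            continue  # skip blank lines
        query, target, identity = paf_identity(line)
        if identity >= min_identity:
            hits.append((query, target, identity))
    return hits
```

Because a contig can have several alignments, a real filter would probably also pick the best hit per query; that step is left out here.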

Sounds right to me. We need minimap2 alignments to three separate references (cannot be combined into one!) anyway for the master table: (a) refseqs, (b) complete genomes including refseqs, and...

IMO we don't need a mapping benchmark for the main Serratus search; we have settled on how to run bowtie2 and are not likely to revisit benchmarking of mappers.

You could track the query sequence length or amount of memory you're allocating (using /proc on Linux or GetProcessMemoryInfo on Windows) and issue a warning/error if much longer/more than expected....
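As a rough illustration of the Linux side of this suggestion, the sketch below reads the resident set size from `/proc/self/status` and warns when the query length or memory use exceeds a limit. The function names and thresholds are hypothetical, and the Windows `GetProcessMemoryInfo` route is not covered.

```python
import warnings


def parse_vmrss_kb(status_text):
    """Extract the VmRSS value (in kB) from /proc/<pid>/status text."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            return int(line.split()[1])
    return None


def current_rss_kb(path="/proc/self/status"):
    """Return this process's resident set size in kB, or None if unavailable."""
    try:
        with open(path) as fh:
            return parse_vmrss_kb(fh.read())
    except OSError:
        return None  # e.g. not running on Linux


def check_limits(query_len, max_query_len, max_rss_kb, rss_kb=None):
    """Warn if the query is longer, or memory use higher, than expected.

    The limits are caller-supplied assumptions, not values from the tool.
    """
    problems = []
    if query_len > max_query_len:
        problems.append(f"query length {query_len} exceeds {max_query_len}")
    if rss_kb is None:
        rss_kb = current_rss_kb()
    if rss_kb is not None and rss_kb > max_rss_kb:
        problems.append(f"resident memory {rss_kb} kB exceeds {max_rss_kb} kB")
    for message in problems:
        warnings.warn(message)
    return problems
```

Whether to warn or abort is a policy choice; returning the list of problems lets the caller decide.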