Phylign

Alignment against all pre-2019 bacteria on laptops within a few hours (former MOF-Search)

Results 12 Phylign issues

First, all files were prepared and checked.

```
$ ls cobs/ | head -n 3
achromobacter_xylosoxidans__01.cobs_classic.xz
acinetobacter_baumannii__01.cobs_classic.xz
acinetobacter_baumannii__02.cobs_classic.xz
$ ls asms/ | head -n 3
achromobacter_xylosoxidans__01.tar.xz
acinetobacter_baumannii__01.tar.xz
acinetobacter_baumannii__02.tar.xz
$ grep...
```

There are some API changes between Snakemake versions.

With 8.5.2:

```
snakemake: error: unrecognized arguments: --cluster=/homes/shenwei/.config/snakemake/lsf/lsf_submit.py --cluster-status=/homes/shenwei/.config/snakemake/lsf/lsf_status.py --cluster-cancel=/homes/shenwei/.config/snakemake/lsf/lsf_cancel.py
```

With 6.2.0:

```
snakemake: error: unrecognized arguments: --cluster-cancel=lsf_cancel.py
```

With 7.0...

bug

Hello, in this pull request I included: 1. a **modification of the Snakefile** to support a larger number of input files. Why? When using ~400 files, the code crashed because...

Thought this was already implemented, but apparently not. It would be nice to support gzipped files in the input folder.
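One way such support could look is sketched below. This is only an illustration, not Phylign's code: the helper names `open_maybe_gzipped` and `count_fasta_records` are hypothetical, and the sketch assumes that detecting gzip by its two magic bytes (rather than by file extension) is acceptable.

```python
import gzip
from pathlib import Path

# First two bytes of every gzip stream.
GZIP_MAGIC = b"\x1f\x8b"


def open_maybe_gzipped(path):
    """Open a file for text reading, decompressing it if it is gzipped.

    Hypothetical helper: detection uses the gzip magic bytes, so it works
    regardless of whether the file name ends in .gz.
    """
    path = Path(path)
    with path.open("rb") as fh:
        magic = fh.read(2)
    if magic == GZIP_MAGIC:
        return gzip.open(path, "rt")
    return path.open("rt")


def count_fasta_records(path):
    """Count FASTA header lines in a plain or gzipped file."""
    with open_maybe_gzipped(path) as fh:
        return sum(1 for line in fh if line.startswith(">"))
```

With a helper like this, the rest of a pipeline can read query files without caring whether they are compressed.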

enhancement

* Based on one of the reviewers' comments
* A bit tricky – not all mappers can work in 1 command
* The way to go: create a bash script...

enhancement

See: https://github.com/karel-brinda/mof-search/blob/e8e681b67538c3eadff2e577581a36183cd27303/scripts/batch_align.py#L150-L154
This clearly does not scale well when the query FASTA is massive (e.g. read sets). One easy and quick way to save a bit more RAM is to...

This is a note for the future. When too many matches are reported by COBS (e.g. large Illumina experiments with many matches due to reads being short), the post-processing of...

fix_later

Happens with e.g. 1M queries. This is the problematic part: https://github.com/karel-brinda/mof-search/blob/e79b0c842ed919f1787a3071a52065d2317c8f71/scripts/final_stats.py#L109
It should probably be possible to optimize this by relying on the output being sorted according to ref (so it's sufficient to keep...
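The general idea of exploiting sorted input can be sketched as follows. This is an assumption-laden illustration, not the logic of `final_stats.py`: the record layout `(ref, query)` and the function name `per_ref_counts` are hypothetical, and the input is assumed to be already sorted by ref. Under that assumption, only the group for the current ref has to be processed at any time, instead of accumulating everything in memory.

```python
from itertools import groupby
from operator import itemgetter


def per_ref_counts(records):
    """Yield (ref, number_of_records) from records sorted by ref.

    `records` is any iterable of (ref, query) tuples (hypothetical layout).
    Because groupby consumes the input lazily, memory use does not grow
    with the total number of records.
    """
    for ref, group in groupby(records, key=itemgetter(0)):
        yield ref, sum(1 for _ in group)
```

If the input were not sorted by ref, the same ref would be reported more than once, so the sorting assumption has to hold for this approach to be valid.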

bug
paper

Currently the filtering is done based on `nb_best_hits`. However, in some situations we want to pass all outputs of COBS, e.g. if we want to re-run the mapping part multiple...
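A minimal sketch of what an optional filter could look like is given below. It is not Phylign's implementation: the function `select_hits`, the `(ref, kmer_matches)` hit layout, and the convention that `None` disables filtering are all assumptions made for illustration; only the name `nb_best_hits` comes from the issue.

```python
def select_hits(hits, nb_best_hits=None):
    """Return hits ranked by match count, optionally truncated.

    hits: list of (ref, kmer_matches) tuples (hypothetical layout).
    nb_best_hits: keep only this many top hits; None keeps every hit.
    """
    # sorted() is stable, so hits with equal counts keep their input order.
    ranked = sorted(hits, key=lambda hit: hit[1], reverse=True)
    if nb_best_hits is None:
        return ranked
    return ranked[:nb_best_hits]
```

Using `None` as the "no filtering" value keeps the default behaviour explicit and avoids a magic number such as a very large limit.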

enhancement

Currently, if the aggregation of results fails, no `match` benchmark is produced.

enhancement
paper