Diamond returns longer queries than the targets it was mapped against.

Open kevinmoran1988 opened this issue 2 years ago • 1 comments

Occasionally, diamond returns a hit longer than the sequences against which it was mapped.

I.e., the hit extends beyond the reference region. As I understand it, this should be impossible.

image

Top 3 sequences are the references. The problematic sequence is obvious.

Here are the target references the sequences were mapped against, so you can build the dmnd db yourself.

UCE-11679.txt

Here is a link to the genome mined.

https://sra-download.ncbi.nlm.nih.gov/traces/wgs03/wgs_aux/AA/DG/AADG06/AADG06.1.fsa_nt.gz

The command used is:

diamond blastx -d {diamond_db_path} -q {input_file.name} -o {out_path} --very-sensitive --masking 0 -e .000001 --compress 1 --outfmt 6 qseqid sseqid qframe evalue bitscore qstart qend sstart send {quiet} --top 10 --min-orf 20 --max-hsps 0 -p {num_threads}
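
For anyone reproducing this, the templated command can be assembled as an argument list. This is only a sketch: the function name, the example file names, and the guess that `{quiet}` expands to `--quiet` (or to nothing) are assumptions, not taken from the original script. Every other flag is copied from the command above.

```python
import shlex


def build_blastx_command(db_path, query_path, out_path, threads, quiet=False):
    """Assemble the blastx invocation from the report as an argument list.

    Hypothetical helper: the flags mirror the reported command, but the
    function itself is not part of the original pipeline.
    """
    cmd = [
        "diamond", "blastx",
        "-d", str(db_path),
        "-q", str(query_path),
        "-o", str(out_path),
        "--very-sensitive",
        "--masking", "0",
        "-e", ".000001",
        "--compress", "1",
        "--outfmt", "6",
        "qseqid", "sseqid", "qframe", "evalue", "bitscore",
        "qstart", "qend", "sstart", "send",
    ]
    if quiet:
        # Assumption: the {quiet} placeholder stands for DIAMOND's --quiet flag.
        cmd.append("--quiet")
    cmd += ["--top", "10", "--min-orf", "20", "--max-hsps", "0", "-p", str(threads)]
    return cmd


# Placeholder paths for illustration only.
cmd = build_blastx_command("refs.dmnd", "genome.fa", "hits.tsv", threads=4)
print(shlex.join(cmd))
```

The list can be passed directly to `subprocess.run(cmd, check=True)`, which avoids shell quoting problems with the file paths.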

Here is the hit as it appears in the log:

NODE_62309 uce-11679_p3 -1 2.55e-06 40.4 642 202 1 34
NODE_62309 uce-11679_p7 -1 2.55e-06 40.4 642 202 1 34
NODE_62309 uce-11679_p9 -1 2.55e-06 40.4 642 202 1 34
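
To make the size mismatch concrete, the spans can be computed from the reported coordinates. A minimal sketch, assuming the columns follow the `--outfmt 6` field order given in the command and that coordinates are 1-based and inclusive; the names `FIELDS` and `spans` are illustrative only.

```python
# Column order taken from the --outfmt 6 field list in the reported command.
FIELDS = ("qseqid", "sseqid", "qframe", "evalue", "bitscore",
          "qstart", "qend", "sstart", "send")

ROWS = """\
NODE_62309 uce-11679_p3 -1 2.55e-06 40.4 642 202 1 34
NODE_62309 uce-11679_p7 -1 2.55e-06 40.4 642 202 1 34
NODE_62309 uce-11679_p9 -1 2.55e-06 40.4 642 202 1 34
"""


def spans(line):
    """Return the query and subject spans for one tabular hit line."""
    row = dict(zip(FIELDS, line.split()))
    qstart, qend = int(row["qstart"]), int(row["qend"])
    sstart, send = int(row["sstart"]), int(row["send"])
    # qstart > qend on a reverse-frame hit, so take the absolute difference.
    query_nt = abs(qend - qstart) + 1
    return {
        "sseqid": row["sseqid"],
        "query_nt": query_nt,
        "query_codons": query_nt // 3,
        "subject_positions": abs(send - sstart) + 1,
    }


results = [spans(line) for line in ROWS.splitlines()]
for r in results:
    print(r)
```

For each of the three rows this gives a query span of 441 nt (147 codons) against 34 subject positions, which is the discrepancy described above.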

Let me know if there is anything I can do to assist.

Running Version 2.1.8.

kevinmoran1988, Oct 20 '23 01:10

Your target sequences seem to be DNA or am I missing something?

bbuchfink, Oct 27 '23 09:10