DIAMOND returns hits longer than the targets they were mapped against.
Occasionally, diamond returns a hit longer than the sequences against which it was mapped.
I.e., the hit extends beyond the reference region. As I understand it, this should be impossible.
Top 3 sequences are the references. The problematic sequence is obvious.
Here are the target references the sequences are mapped against, so you can create the dmnd db.
Here is a link to the genome mined.
https://sra-download.ncbi.nlm.nih.gov/traces/wgs03/wgs_aux/AA/DG/AADG06/AADG06.1.fsa_nt.gz
The command used is:
diamond blastx -d {diamond_db_path} -q {input_file.name} -o {out_path} --very-sensitive --masking 0 -e .000001 --compress 1 --outfmt 6 qseqid sseqid qframe evalue bitscore qstart qend sstart send {quiet} --top 10 --min-orf 20 --max-hsps 0 -p {num_threads}
Here is the hit as represented in the log:
NODE_62309 uce-11679_p3 -1 2.55e-06 40.4 642 202 1 34
NODE_62309 uce-11679_p7 -1 2.55e-06 40.4 642 202 1 34
NODE_62309 uce-11679_p9 -1 2.55e-06 40.4 642 202 1 34
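To make the mismatch concrete, here is a rough sketch that computes the aligned spans for one of these hits. The column order is taken from the --outfmt 6 field list in the command; the helper names `parse_hit` and `spans` are made up for illustration. It assumes the usual blastx convention that qstart/qend are nucleotide positions on the query (reversed for a negative frame) and sstart/send are positions on the target.

```python
# Column order copied from the --outfmt 6 field list in the command above.
FIELDS = ["qseqid", "sseqid", "qframe", "evalue", "bitscore",
          "qstart", "qend", "sstart", "send"]


def parse_hit(line):
    """Split one whitespace-delimited hit line into a dict keyed by FIELDS."""
    values = line.split()
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} columns, got {len(values)}")
    hit = dict(zip(FIELDS, values))
    for key in ("qframe", "qstart", "qend", "sstart", "send"):
        hit[key] = int(hit[key])
    return hit


def spans(hit):
    """Return (query span in nt, query span in codons, target span in residues)."""
    # abs() handles reverse-frame hits, where qstart > qend.
    q_nt = abs(hit["qend"] - hit["qstart"]) + 1
    s_res = abs(hit["send"] - hit["sstart"]) + 1
    return q_nt, q_nt // 3, s_res


hit = parse_hit("NODE_62309 uce-11679_p3 -1 2.55e-06 40.4 642 202 1 34")
q_nt, q_codons, s_res = spans(hit)
print(q_nt, q_codons, s_res)  # 441 147 34
```

Under that assumption, the query side of the alignment covers 441 nt (147 codons), while the target side covers only 34 positions.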
Let me know if there is anything I can do to assist.
Running version 2.1.8.
Your target sequences seem to be DNA or am I missing something?
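A quick way to check this is a rough alphabet heuristic, sketched below. The function names, the 0.9 threshold, and the two example records are hypothetical and not taken from the attached references; short or low-complexity protein sequences could fool it.

```python
import io

NUCLEOTIDE_CHARS = set("ACGTUN")


def read_fasta(handle):
    """Yield (name, sequence) pairs from a FASTA-formatted text handle."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:].strip(), []
        elif line:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def looks_like_dna(seq, threshold=0.9):
    """Heuristic: True if at least `threshold` of the characters are nucleotide codes."""
    seq = seq.upper().replace("-", "")
    if not seq:
        return False
    nucleotide_count = sum(1 for c in seq if c in NUCLEOTIDE_CHARS)
    return nucleotide_count / len(seq) >= threshold


# Hypothetical records, for illustration only.
example = ">ref_dna\nATGGCCAAGTTTGGA\n>ref_protein\nMKVLAAGIVGLW\n"
results = {name: looks_like_dna(seq) for name, seq in read_fasta(io.StringIO(example))}
print(results)  # {'ref_dna': True, 'ref_protein': False}
```

If the references come back as DNA, they would need to be translated to protein before building the database, since blastx expects a protein database.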