How to get output similar to the blastp default
Hi, would it be possible to get output similar to the default blastp output? I am running another program that parses the default blastp output to get the top hits.
Here is an example of blastp output, produced by running blastp -evalue 1e-10 -db astral40 -query astral-scopedom-seqres-gd-sel-gs-bib-40-2.07.fa -out blastp_rslt.txt
BLASTP 2.11.0+
Reference: Stephen F. Altschul, Thomas L. Madden, Alejandro A.
Schaffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J.
Lipman (1997), "Gapped BLAST and PSI-BLAST: a new generation of
protein database search programs", Nucleic Acids Res. 25:3389-3402.
Reference for composition-based statistics: Alejandro A. Schaffer,
L. Aravind, Thomas L. Madden, Sergei Shavirin, John L. Spouge, Yuri
I. Wolf, Eugene V. Koonin, and Stephen F. Altschul (2001),
"Improving the accuracy of PSI-BLAST protein database searches with
composition-based statistics and other refinements", Nucleic Acids
Res. 29:2994-3005.
Database: astral-scopedom-seqres-gd-sel-gs-bib-40-2.07.fa
14,323 sequences; 2,611,087 total letters
Query= d1dlwa_ a.1.1.1 (A:) Protozoan/bacterial hemoglobin {Ciliate
(Paramecium caudatum) [TaxId: 5885]}
Length=116
Score E
Sequences producing significant alignments: (Bits) Value
d1dlwa_ a.1.1.1 (A:) Protozoan/bacterial hemoglobin {Ciliate (Par... 231 5e-81
d2gkma_ a.1.1.1 (A:) Protozoan/bacterial hemoglobin {Mycobacteriu... 87.8 3e-24
d4i0va_ a.1.1.1 (A:) automated matches {Synechococcus sp [TaxId: ... 67.8 2e-16
d6bmea_ a.1.1.0 (A:) automated matches {Green alga (Chlamydomonas... 56.2 6e-12
>d1dlwa_ a.1.1.1 (A:) Protozoan/bacterial hemoglobin {Ciliate
(Paramecium caudatum) [TaxId: 5885]}
Length=116
Score = 231 bits (588), Expect = 5e-81, Method: Compositional matrix adjust.
Identities = 116/116 (100%), Positives = 116/116 (100%), Gaps = 0/116 (0%)
Query 1 SLFEQLGGQAAVQAVTAQFYANIQADATVATFFNGIDMPNQTNKTAAFLCAALGGPNAWT 60
SLFEQLGGQAAVQAVTAQFYANIQADATVATFFNGIDMPNQTNKTAAFLCAALGGPNAWT
Sbjct 1 SLFEQLGGQAAVQAVTAQFYANIQADATVATFFNGIDMPNQTNKTAAFLCAALGGPNAWT 60
Query 61 GRNLKEVHANMGVSNAQFTTVIGHLRSALTGAGVAAALVEQTVAVAETVRGDVVTV 116
GRNLKEVHANMGVSNAQFTTVIGHLRSALTGAGVAAALVEQTVAVAETVRGDVVTV
Sbjct 61 GRNLKEVHANMGVSNAQFTTVIGHLRSALTGAGVAAALVEQTVAVAETVRGDVVTV 116
>d2gkma_ a.1.1.1 (A:) Protozoan/bacterial hemoglobin {Mycobacterium
tuberculosis, HbN [TaxId: 1773]}
Length=127
Score = 87.8 bits (216), Expect = 3e-24, Method: Compositional matrix adjust.
Identities = 40/115 (35%), Positives = 65/115 (57%), Gaps = 0/115 (0%)
Query 1 SLFEQLGGQAAVQAVTAQFYANIQADATVATFFNGIDMPNQTNKTAAFLCAALGGPNAWT 60
S+++++GG A++ V F+ + AD ++ FF+G +M K F AALGGP +T
Sbjct 13 SIYDKIGGHEAIEVVVEDFFVRVLADDQLSAFFSGTNMSRLKGKQVEFFAAALGGPEPYT 72
Query 61 GRNLKEVHANMGVSNAQFTTVIGHLRSALTGAGVAAALVEQTVAVAETVRGDVVT 115
G +K+VH G++ F+ V GHL ALT AGV + + + + V + DV +
Sbjct 73 GAPMKQVHQGRGITMHHFSLVAGHLADALTAAGVPSETITEILGVIAPLAVDVTS 127
Thank you
You can get this format using the option -f 0 (long form: --outfmt 0).
Yes. Thank you for your reply. I apparently cannot read. Thank you for making this tool. I'm still amazed at how fast this is compared to blast. Very needed tool!
Sorry for re-opening this issue, but I found that the output with -f 0 is still missing the list of significant alignments that appears in the blastp result. It's missing this part:
Score E
Sequences producing significant alignments: (Bits) Value
d1dlwa_ a.1.1.1 (A:) Protozoan/bacterial hemoglobin {Ciliate (Par... 231 5e-81
d2gkma_ a.1.1.1 (A:) Protozoan/bacterial hemoglobin {Mycobacteriu... 87.8 3e-24
d4i0va_ a.1.1.1 (A:) automated matches {Synechococcus sp [TaxId: ... 67.8 2e-16
d6bmea_ a.1.1.0 (A:) automated matches {Green alga (Chlamydomonas... 56.2 6e-12
This is what I got from diamond with the -f 0 option:
BLASTP 2.3.0+
Query= d1dlwa_ a.1.1.1 (A:) Protozoan/bacterial hemoglobin {Ciliate (Paramecium caudatum) [TaxId: 5885]}
Length=116
>d1dlwa_ a.1.1.1 (A:) Protozoan/bacterial hemoglobin {Ciliate (Paramecium caudatum) [TaxId: 5885]}
Length=116
Score = 220 bits (561), Expect = 6.42e-77
Identities = 116/116 (100%), Positives = 116/116 (100%), Gaps = 0/116 (0%)
Query 1 SLFEQLGGQAAVQAVTAQFYANIQADATVATFFNGIDMPNQTNKTAAFLCAALGGPNAWT 60
SLFEQLGGQAAVQAVTAQFYANIQADATVATFFNGIDMPNQTNKTAAFLCAALGGPNAWT
Sbjct 1 SLFEQLGGQAAVQAVTAQFYANIQADATVATFFNGIDMPNQTNKTAAFLCAALGGPNAWT 60
Query 61 GRNLKEVHANMGVSNAQFTTVIGHLRSALTGAGVAAALVEQTVAVAETVRGDVVTV 116
GRNLKEVHANMGVSNAQFTTVIGHLRSALTGAGVAAALVEQTVAVAETVRGDVVTV
Sbjct 61 GRNLKEVHANMGVSNAQFTTVIGHLRSALTGAGVAAALVEQTVAVAETVRGDVVTV 116
>d4i0va_ a.1.1.1 (A:) automated matches {Synechococcus sp. [TaxId: 32049]}
Length=123
Score = 58.2 bits (139), Expect = 9.11e-13
Identities = 38/119 (31%), Positives = 59/119 (49%), Gaps = 6/119 (5%)
Query 1 SLFEQLGGQAAVQAVTAQFYANIQADATVATFFNGIDMPNQTNKTAAFLCAALGGPNAWT 60
SL+E+LGG AAV +FY + AD V FF DM Q F+ A GG + +
Sbjct 2 SLYEKLGGAAAVDLAVEKFYGKVLADERVNRFFVNTDMAKQKQHQKDFMTYAFGGTDRFP 61
Query 61 GRNLKEVHA----NMGVSNAQFTTVIGHLRSALTGAGVAAALVEQTVAVAETV--RGDV 113
GR+++ H N G+++ F + +L L V+ L+++ V + +V R DV
Sbjct 62 GRSMRAAHQDLVENAGLTDVHFDAIAENLVLTLQELNVSQDLIDEVVTIVGSVQHRNDV 120
Query= d2gkma_ a.1.1.1 (A:) Protozoan/bacterial hemoglobin {Mycobacterium tuberculosis, HbN [TaxId: 1773]}
Length=127
Is the difference because of the different version of blastp?
Thank you!
Sorry for the late reply. Yes, the BLAST-like table is missing in this format. I need to add this when I find the time.
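Until that table is available, one possible workaround is to rebuild it from the -f 0 output itself. Below is a minimal sketch in Python, assuming the layout shown in this thread: a "Query=" line per query, a ">" header per subject, and a "Score = ... bits (...), Expect = ..." line per alignment. The function names summarize_hits and format_table are hypothetical and not part of diamond or BLAST, and the column widths are a guess rather than the exact blastp layout, so they may need adjusting for a strict downstream parser.

```python
import re

# Matches e.g. "Score = 58.2 bits (139),  Expect = 9.11e-13" and also the
# blastp variant "Score = 231 bits (588), Expect = 5e-81, Method: ...".
SCORE_RE = re.compile(
    r"Score\s*=\s*([\d.]+)\s+bits\s+\(\d+\),\s*Expect(?:\(\d+\))?\s*=\s*([^\s,]+)"
)


def summarize_hits(text):
    """Map each query title to a list of (subject title, bit score, e-value).

    Assumption: titles sit on a single line, as in the diamond output above.
    For wrapped blastp titles only the first line of the title is kept.
    """
    hits = {}
    query = subject = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Query="):
            query = line[len("Query="):].strip()
            hits[query] = []
            subject = None
        elif line.startswith(">"):
            subject = line[1:].strip()
        elif query is not None and subject is not None:
            match = SCORE_RE.search(line)
            if match:
                hits[query].append((subject, float(match.group(1)), match.group(2)))
                subject = None  # keep only the first alignment per subject
    return hits


def format_table(rows, width=66):
    """Render rows as a BLAST-like summary table (column widths are a guess)."""
    header = "Sequences producing significant alignments:".ljust(width)
    lines = [header + "  (Bits)  Value", ""]
    for subject, bits, evalue in rows:
        # Truncate long titles with "..." the way the blastp table does.
        title = subject if len(subject) <= width else subject[: width - 3] + "..."
        lines.append(f"{title.ljust(width)}  {bits:<6g}  {evalue}")
    return "\n".join(lines)
```

To use it, read the diamond output file into a string, call summarize_hits on it, and pass the rows for each query to format_table. The e-value is kept as the original string so that it is printed exactly as diamond reported it.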