
How to get output similar to the blastp default

Open olechnwin opened this issue 2 years ago • 4 comments

Hi, would it be possible to get output similar to the default blastp output? I am running another program that parses the default blastp output to get the top hits.

Here is an example of blastp output, produced by running blastp -evalue 1e-10 -db astral40 -query astral-scopedom-seqres-gd-sel-gs-bib-40-2.07.fa -out blastp_rslt.txt:

BLASTP 2.11.0+


Reference: Stephen F. Altschul, Thomas L. Madden, Alejandro A.
Schaffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J.
Lipman (1997), "Gapped BLAST and PSI-BLAST: a new generation of
protein database search programs", Nucleic Acids Res. 25:3389-3402.


Reference for composition-based statistics: Alejandro A. Schaffer,
L. Aravind, Thomas L. Madden, Sergei Shavirin, John L. Spouge, Yuri
I. Wolf, Eugene V. Koonin, and Stephen F. Altschul (2001),
"Improving the accuracy of PSI-BLAST protein database searches with
composition-based statistics and other refinements", Nucleic Acids
Res. 29:2994-3005.



Database: astral-scopedom-seqres-gd-sel-gs-bib-40-2.07.fa
           14,323 sequences; 2,611,087 total letters



Query= d1dlwa_ a.1.1.1 (A:) Protozoan/bacterial hemoglobin {Ciliate
(Paramecium caudatum) [TaxId: 5885]}

Length=116
                                                                      Score     E
Sequences producing significant alignments:                          (Bits)  Value

d1dlwa_ a.1.1.1 (A:) Protozoan/bacterial hemoglobin {Ciliate (Par...  231     5e-81
d2gkma_ a.1.1.1 (A:) Protozoan/bacterial hemoglobin {Mycobacteriu...  87.8    3e-24
d4i0va_ a.1.1.1 (A:) automated matches {Synechococcus sp [TaxId: ...  67.8    2e-16
d6bmea_ a.1.1.0 (A:) automated matches {Green alga (Chlamydomonas...  56.2    6e-12


>d1dlwa_ a.1.1.1 (A:) Protozoan/bacterial hemoglobin {Ciliate 
(Paramecium caudatum) [TaxId: 5885]}
Length=116

 Score = 231 bits (588),  Expect = 5e-81, Method: Compositional matrix adjust.
 Identities = 116/116 (100%), Positives = 116/116 (100%), Gaps = 0/116 (0%)

Query  1    SLFEQLGGQAAVQAVTAQFYANIQADATVATFFNGIDMPNQTNKTAAFLCAALGGPNAWT  60
            SLFEQLGGQAAVQAVTAQFYANIQADATVATFFNGIDMPNQTNKTAAFLCAALGGPNAWT
Sbjct  1    SLFEQLGGQAAVQAVTAQFYANIQADATVATFFNGIDMPNQTNKTAAFLCAALGGPNAWT  60

Query  61   GRNLKEVHANMGVSNAQFTTVIGHLRSALTGAGVAAALVEQTVAVAETVRGDVVTV  116
            GRNLKEVHANMGVSNAQFTTVIGHLRSALTGAGVAAALVEQTVAVAETVRGDVVTV
Sbjct  61   GRNLKEVHANMGVSNAQFTTVIGHLRSALTGAGVAAALVEQTVAVAETVRGDVVTV  116


>d2gkma_ a.1.1.1 (A:) Protozoan/bacterial hemoglobin {Mycobacterium 
tuberculosis, HbN [TaxId: 1773]}
Length=127

 Score = 87.8 bits (216),  Expect = 3e-24, Method: Compositional matrix adjust.
 Identities = 40/115 (35%), Positives = 65/115 (57%), Gaps = 0/115 (0%)

Query  1    SLFEQLGGQAAVQAVTAQFYANIQADATVATFFNGIDMPNQTNKTAAFLCAALGGPNAWT  60
            S+++++GG  A++ V   F+  + AD  ++ FF+G +M     K   F  AALGGP  +T
Sbjct  13   SIYDKIGGHEAIEVVVEDFFVRVLADDQLSAFFSGTNMSRLKGKQVEFFAAALGGPEPYT  72

Query  61   GRNLKEVHANMGVSNAQFTTVIGHLRSALTGAGVAAALVEQTVAVAETVRGDVVT  115
            G  +K+VH   G++   F+ V GHL  ALT AGV +  + + + V   +  DV +
Sbjct  73   GAPMKQVHQGRGITMHHFSLVAGHLADALTAAGVPSETITEILGVIAPLAVDVTS  127

Thank you

olechnwin avatar May 26 '23 20:05 olechnwin

You can get this format using the option -f 0.

bbuchfink avatar May 30 '23 10:05 bbuchfink
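
For readers who want to script this, here is a minimal sketch of assembling the corresponding DIAMOND call. It assumes diamond is on the PATH and that a DIAMOND database was already built from the ASTRAL FASTA file; the function name and the file names in the usage note are hypothetical and do not come from the thread.

```python
import shlex


def diamond_pairwise_command(query, db, out, evalue="1e-10"):
    """Build an argument list for a DIAMOND blastp run in pairwise format.

    --outfmt 0 is the long form of the -f 0 option mentioned above.
    All paths are supplied by the caller; nothing is executed here.
    """
    return [
        "diamond", "blastp",
        "--query", query,
        "--db", db,
        "--out", out,
        "--outfmt", "0",
        "--evalue", evalue,
    ]


def as_shell_line(cmd):
    """Render the argument list as a single shell-quoted line for logging."""
    return shlex.join(cmd)
```

Passing the list to subprocess.run(cmd, check=True) would launch the search, for example with diamond_pairwise_command("queries.fa", "astral40", "diamond_rslt.txt"), where all three names are placeholders.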

Yes. Thank you for your reply. I apparently cannot read. Thank you for making this tool. I'm still amazed at how fast this is compared to blast. Very needed tool!

olechnwin avatar May 30 '23 13:05 olechnwin

Sorry for re-opening this issue, but I found that the output with -f 0 is still missing the list of significant alignments that appears in the blastp result. It's missing this part:

                                                                      Score     E
Sequences producing significant alignments:                          (Bits)  Value

d1dlwa_ a.1.1.1 (A:) Protozoan/bacterial hemoglobin {Ciliate (Par...  231     5e-81
d2gkma_ a.1.1.1 (A:) Protozoan/bacterial hemoglobin {Mycobacteriu...  87.8    3e-24
d4i0va_ a.1.1.1 (A:) automated matches {Synechococcus sp [TaxId: ...  67.8    2e-16
d6bmea_ a.1.1.0 (A:) automated matches {Green alga (Chlamydomonas...  56.2    6e-12

This is what I got from diamond with the -f 0 option:

BLASTP 2.3.0+


Query= d1dlwa_ a.1.1.1 (A:) Protozoan/bacterial hemoglobin {Ciliate (Paramecium caudatum) [TaxId: 5885]}

Length=116

>d1dlwa_ a.1.1.1 (A:) Protozoan/bacterial hemoglobin {Ciliate (Paramecium caudatum) [TaxId: 5885]}
Length=116

 Score = 220 bits (561),  Expect = 6.42e-77
 Identities = 116/116 (100%), Positives = 116/116 (100%), Gaps = 0/116 (0%)

Query    1  SLFEQLGGQAAVQAVTAQFYANIQADATVATFFNGIDMPNQTNKTAAFLCAALGGPNAWT 60
            SLFEQLGGQAAVQAVTAQFYANIQADATVATFFNGIDMPNQTNKTAAFLCAALGGPNAWT
Sbjct    1  SLFEQLGGQAAVQAVTAQFYANIQADATVATFFNGIDMPNQTNKTAAFLCAALGGPNAWT 60

Query   61  GRNLKEVHANMGVSNAQFTTVIGHLRSALTGAGVAAALVEQTVAVAETVRGDVVTV 116
            GRNLKEVHANMGVSNAQFTTVIGHLRSALTGAGVAAALVEQTVAVAETVRGDVVTV
Sbjct   61  GRNLKEVHANMGVSNAQFTTVIGHLRSALTGAGVAAALVEQTVAVAETVRGDVVTV 116

>d4i0va_ a.1.1.1 (A:) automated matches {Synechococcus sp. [TaxId: 32049]}
Length=123

 Score = 58.2 bits (139),  Expect = 9.11e-13
 Identities = 38/119 (31%), Positives = 59/119 (49%), Gaps = 6/119 (5%)

Query    1  SLFEQLGGQAAVQAVTAQFYANIQADATVATFFNGIDMPNQTNKTAAFLCAALGGPNAWT 60
            SL+E+LGG AAV     +FY  + AD  V  FF   DM  Q      F+  A GG + + 
Sbjct    2  SLYEKLGGAAAVDLAVEKFYGKVLADERVNRFFVNTDMAKQKQHQKDFMTYAFGGTDRFP 61

Query   61  GRNLKEVHA----NMGVSNAQFTTVIGHLRSALTGAGVAAALVEQTVAVAETV--RGDV 113
            GR+++  H     N G+++  F  +  +L   L    V+  L+++ V +  +V  R DV
Sbjct   62  GRSMRAAHQDLVENAGLTDVHFDAIAENLVLTLQELNVSQDLIDEVVTIVGSVQHRNDV 120

Query= d2gkma_ a.1.1.1 (A:) Protozoan/bacterial hemoglobin {Mycobacterium tuberculosis, HbN [TaxId: 1773]}

Length=127

Is the difference due to the different blastp version?

Thank you!

olechnwin avatar Jun 21 '23 16:06 olechnwin

Sorry for the late reply. Yes, the BLAST-like table is missing in this format. I need to add this when I find the time.

bbuchfink avatar Aug 21 '23 13:08 bbuchfink