
Parse hhr file

Open bognabognabogna opened this issue 4 years ago • 2 comments

Hello, I am trying to get hhblits results in a data table format. I have seen a recommendation to use the -blasttab option (see the closed issue "Script to parse HHR output file", #120).

However, that option doesn't output the probabilities. Is there a way to calculate the Prob. column from the data generated by -blasttab, or alternatively a way to include the probabilities in the table?

Best wishes

bognabognabogna avatar Sep 30 '19 15:09 bognabognabogna

If this is still relevant: I wrote an R parser for the hhr format for my own use (see the function ?read.hhr). I hope it covers most of the variations in hhr files. The output is a tibble that follows the hhr layout as closely as possible and has the following fields:

| Name | Type |
| --- | --- |
| No | integer |
| Hit.ID | character |
| Hit.Description | character |
| Q.ss_pred | character |
| Q.query | character |
| Q.consensus | character |
| Q.Start | integer |
| Q.End | integer |
| Q.Length | integer |
| T.consensus | character |
| T.Start | integer |
| T.End | integer |
| T.Length | integer |
| T.hit | character |
| T.ss_dssp | character |
| T.ss_pred | character |
| Aligned_cols | integer |
| E.value | numeric |
| Identities | numeric |
| Probab | numeric |
| Score | numeric |
| Similarity | numeric |
| Sum_probs | numeric |
| Template_Neff | numeric |
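
For readers who work in Python rather than R, the statistics fields above (Probab, E-value, Score and so on) can be pulled from the alignment blocks of an hhr file with a few lines of code. The following is only a minimal sketch, not the R parser described here. It assumes each alignment block starts with a `No <n>` line, followed by a `>` line holding the hit ID and description, and a line of `key=value` pairs beginning with `Probab=`. The function name `parse_hhr_hits` and the sample text are made up for illustration.

```python
import re


def parse_hhr_hits(text):
    """Collect per-hit statistics from the alignment blocks of hhr text.

    Assumption: blocks look like "No 1" / ">id description" / "Probab=... ...".
    """
    hits = []
    current = None
    for line in text.splitlines():
        if re.fullmatch(r"No \d+", line.strip()):
            # A new alignment block begins.
            current = {"No": int(line.split()[1])}
            hits.append(current)
        elif current is not None and line.startswith(">") and "Hit.ID" not in current:
            # First word after ">" is taken as the ID, the rest as the description.
            hit_id, _, description = line[1:].partition(" ")
            current["Hit.ID"] = hit_id
            current["Hit.Description"] = description.strip()
        elif current is not None and line.startswith("Probab="):
            for field in line.split():
                key, _, value = field.partition("=")
                if key == "Aligned_cols":
                    current[key] = int(value)
                else:
                    # Identities carries a trailing "%", which is dropped here.
                    current[key] = float(value.rstrip("%"))
    return hits


# Made-up sample in the assumed layout (not real search output).
sample = """\
No 1
>1abc_A Example protein; hypothetical entry
Probab=99.90  E-value=1.2e-30  Score=200.10  Aligned_cols=100  Identities=30%  Similarity=0.500  Sum_probs=90.0  Template_Neff=8.000

No 2
>2xyz_B Another example
Probab=45.20  E-value=0.5  Score=30.00  Aligned_cols=40  Identities=15%  Similarity=0.100  Sum_probs=20.5  Template_Neff=6.500
"""

hits = parse_hhr_hits(sample)
for hit in hits:
    print(hit["No"], hit["Hit.ID"], hit["Probab"], hit["E-value"])
```

The resulting list of dictionaries can be written out as a table with the `csv` module or loaded into a data frame, which gives the probability column that -blasttab lacks.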

The metadata in the header are preserved as attributes:

| Name | Type |
| --- | --- |
| Query | character |
| Match_columns | integer |
| No_of_seqs | character |
| Neff | integer |
| Searched_HMMs | integer |
| Date | character |
| Command | character |
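
The header fields can be read in a similar way. The sketch below is an illustration in Python, not the R implementation. It assumes the header consists of lines that start with one of the keywords listed above, followed by whitespace and the value, and that the header ends at the first blank line. `parse_hhr_header` and the sample header are hypothetical; only the two clearly integer fields are converted, and everything else is kept as a string.

```python
HEADER_KEYS = ("Query", "Match_columns", "No_of_seqs", "Neff",
               "Searched_HMMs", "Date", "Command")


def parse_hhr_header(text):
    """Read the keyword/value header lines at the top of hhr text."""
    header = {}
    for line in text.splitlines():
        if not line.strip():
            if header:
                break  # assumed: a blank line ends the header
            continue
        key, _, value = line.partition(" ")
        if key in HEADER_KEYS:
            header[key] = value.strip()
    # Convert only the fields that are plain integers in the assumed layout.
    for key in ("Match_columns", "Searched_HMMs"):
        if key in header:
            header[key] = int(header[key])
    return header


# Made-up header in the assumed layout (not real search output).
sample_header = """\
Query         example_query
Match_columns 120
No_of_seqs    85 out of 400
Neff          7.2
Searched_HMMs 5000
Date          Mon Jan  1 12:00:00 2020
Command       hhblits -i example.a3m -d exampledb -o example.hhr

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM  Template HMM
"""

header = parse_hhr_header(sample_header)
print(header["Query"], header["Match_columns"], header["No_of_seqs"])
```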

alephreish avatar Jul 19 '20 19:07 alephreish

The R parser is certainly useful, and I have written a similar one myself, but I still think adding a "probability" column to the -blasttab output would help a lot of people. If the tireless developers of HH-suite found the time to do it, it would be greatly appreciated (at the very least by this heavy user of HH-suite!)

sean-bam avatar Jul 23 '20 22:07 sean-bam