
extract query and aligned sequences

Open cjieming2 opened this issue 3 years ago • 1 comments

Hi, I like how rBLAST automates the extraction of quantifiable values, such as bit scores, from the BLAST output. For a project, I also need to extract the aligned query and database sequences. Is there currently a way in rBLAST to output that information along with all the other results? Thanks!

cjieming2 avatar Jul 27 '22 01:07 cjieming2

This is currently not implemented. What would the call to blastn look like to get the desired output? I would need a complete example and then I can try to add it to rBLAST.

-Michael

mhahsler avatar Jul 27 '22 16:07 mhahsler

The 'custom_format' argument to the predict() function allows that. The aligned portions of the query and database sequences can be added as additional columns in the results data frame by calling predict() with 'custom_format = fmt_args', where 'fmt_args' is defined as:

fmt_args <- "qaccver saccver pident length mismatch gapopen qstart qend sstart send evalue bitscore qseq sseq"

Unfortunately, the column names in the result data frame are then not as "pretty" as usual, but they can easily be reset to the usual values.
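For anyone who wants to restore the names, here is a minimal sketch of the renaming step. It assumes the result columns are named after the format specifiers in 'fmt_args', and that the usual "pretty" names are the twelve listed below. That list, and the names 'Query.Seq' and 'Subject.Seq' for the two extra columns, are my assumptions, so check them against your own output. A mock one-row data frame stands in for the real predict() result so that the snippet runs without BLAST installed.

```r
# Format string from above: the 12 default tabular columns plus the
# aligned query (qseq) and subject (sseq) sequences.
fmt_args <- "qaccver saccver pident length mismatch gapopen qstart qend sstart send evalue bitscore qseq sseq"

# One BLAST specifier per result column.
blast_names <- strsplit(fmt_args, " +")[[1]]

# ASSUMPTION: the usual "pretty" names for the 12 default columns.
# 'Query.Seq' and 'Subject.Seq' are hypothetical names chosen here.
pretty_names <- c("QueryID", "SubjectID", "Perc.Ident", "Alignment.Length",
                  "Mismatches", "Gap.Openings", "Q.start", "Q.end",
                  "S.start", "S.end", "E", "Bits",
                  "Query.Seq", "Subject.Seq")

# The real call would be (needs BLAST+ and a database object 'bl'):
#   res <- predict(bl, seq, custom_format = fmt_args)
# Stand-in for that result: a one-row mock with the same column names.
res <- data.frame(matrix(NA, nrow = 1, ncol = length(blast_names)))
colnames(res) <- blast_names

# Rename by lookup rather than by position, so that a different column
# order in the format string still maps to the right names.
lookup <- setNames(pretty_names, blast_names)
colnames(res) <- unname(lookup[colnames(res)])
```

Renaming through a lookup table rather than assigning 'pretty_names' directly keeps the mapping correct if the specifiers in 'fmt_args' are reordered.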

@mhahsler - could another argument be added to predict() that would contain the desired column names?

AnotherKiwi avatar Oct 25 '23 23:10 AnotherKiwi

Thanks for finding this. I have updated the manpage with your format example.

The inconsistency with column names is not great. I have decided to now always use the standard names used by BLAST. This makes the results more consistent, but unfortunately the non-standard pretty column names are lost.

-Michael

mhahsler avatar Oct 26 '23 16:10 mhahsler