crb-blast

Using the length-evalue function for blast best hits (BH)

Open 000generic opened this issue 6 years ago • 1 comments

Hi!

crb-blast seems to work really well - I'm finding a 17-39% improvement over blastp (both with default evalue cutoffs of 1e-5) in identifying RBHs between de novo squid and pygmy squid transcriptomes vs a set of animal genomes (see attached).

To annotate my transcriptomes I would like to use in order of preference:

1. RBH (reciprocal best hit) to human, fish, fly, worm, oyster, snail, octopus, anemone
2. BH (best hit) to human, fish, fly, worm, oyster, snail, octopus, anemone
3. NH (no hit) to human, fish, fly, worm, oyster, snail, octopus, anemone

with human preferred as the annotation species over fish over fly over worm etc.
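A minimal sketch of the preference order described above, in Python. It assumes the hits for one transcript have already been collected into a dictionary keyed by (hit class, species); that layout, and the names `SPECIES`, `HIT_CLASSES` and `pick_annotation`, are hypothetical and not part of crb-blast. It also assumes that any RBH outranks any BH, with species order only breaking ties within a hit class.

```python
# Species in order of annotation preference, as listed in the post.
SPECIES = ["human", "fish", "fly", "worm", "oyster", "snail", "octopus", "anemone"]

# Hit classes in order of preference: reciprocal best hit, then one-way best hit.
HIT_CLASSES = ["RBH", "BH"]


def pick_annotation(hits):
    """Choose one annotation for a transcript.

    hits: dict mapping (hit_class, species) -> target ID (hypothetical layout).
    Returns (hit_class, species, target), or ("NH", None, None) if nothing hit.
    """
    for hit_class in HIT_CLASSES:      # every RBH is tried before any BH
        for species in SPECIES:        # human before fish before fly ...
            target = hits.get((hit_class, species))
            if target is not None:
                return hit_class, species, target
    return "NH", None, None            # no hit in any species


# Example: an RBH to fly is preferred over a one-way BH to human.
example = {("BH", "human"): "H1", ("RBH", "fly"): "F1"}
print(pick_annotation(example))
```

If species should take priority over hit class instead (for example, a BH to human outranking an RBH to fly), swapping the two loops would give that behaviour.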

Given that the function produced by crb-blast provides greater sensitivity than a flat cutoff, I was wondering if it's possible for me to extract the evalue cutoff for a given length from the files already produced. And/or could a future version of crb-blast produce additional BH files for (query -> target) and (target -> query)? This would expand the number of genes that could be annotated - and I can imagine it could have other uses.

Thank you, Eric

RBH methods - squid and pygmy squid.pdf

000generic avatar Mar 05 '18 22:03 000generic

Hey @000generic, the information you want should be in the two files evalues_data (output here) and fitting_data (output here) in the output directory. Let me know if you need more help using that information.

blahah avatar May 28 '20 16:05 blahah