Add Aligned Query Seq/Subject Seq to M8 output?

Open BGemler opened this issue 6 years ago • 6 comments

Would it be possible to add an option to --outfmt (for .m8/.m9 output file extensions) for searchp and searchn to return the aligned portions of the query sequence/subject sequence?

Thank you

BGemler avatar Apr 01 '19 22:04 BGemler

In theory that's possible; however, development focus has switched to the new version of lambda: lambda3.

I will see how much work this feature would be for lambda2, though.

h-2 avatar Apr 03 '19 16:04 h-2

Understood. Thank you!

BGemler avatar Apr 03 '19 17:04 BGemler

Hello,

Following up on the above item to see if it's in development for Lambda3. We're processing large amounts of data in pairwise format because we need the aligned portions of the query/subject sequences (including gaps when applicable) - if it would be possible to include these in the tabular output in Lambda3 it would be awesome!

BGemler avatar Jan 30 '20 15:01 BGemler

We would need to get changes into SeqAn2 for this. I am not sure that is still possible, but I will investigate!

h-2 avatar Aug 15 '22 13:08 h-2

Hey @h-2 - saw some activity on this repo! Are there any updates on when LAMBDA3 will be released?

BGemler avatar Feb 14 '23 19:02 BGemler

Hi! Yes, there is quite a bit of development, and many improvements and new features are on the horizon. I don't have an exact date yet, but the protein mode on the lambda3 branch is already fairly stable.

Regarding this feature request: We will still be using the SeqAn2 code for writing the output files. Unfortunately, this means that we will not be able to make major changes to the Blast output.

I currently see the following options:

  1. Use .m0 and parse the strings from that. Not ideal, because it is an ugly format, but possible.
  2. Use the positions printed in .m8 and extract the sequences manually. I guess you are doing something similar right now, but it is not very ergonomic, of course.
  3. Use SAM or BAM output. These already offer printing the matching query sequence (SEQ field / qs field) plus the hard-clip option. The interfaces for this are more modular, and we would be able to add an option for also printing the subject sequence if this is important to you!

h-2 avatar Feb 14 '23 19:02 h-2
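
For readers who land here looking for option 2, the following is a minimal sketch, not part of lambda itself. It assumes the standard 12-column BLAST tabular layout (qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore), 1-based inclusive coordinates, and that the positions refer to the sequences exactly as you have them loaded (so no translated searches). The names `M8Hit`, `parse_m8_line`, `slice_1based` and `aligned_regions` are hypothetical helpers introduced for illustration. Note that this only recovers the ungapped substrings; the gap positions themselves are not available from .m8 and would need a re-alignment or one of the other formats.

```python
from typing import Dict, NamedTuple, Tuple


class M8Hit(NamedTuple):
    """Subset of the fields of one .m8 line (hypothetical helper type)."""
    qseqid: str
    sseqid: str
    qstart: int
    qend: int
    sstart: int
    send: int


def parse_m8_line(line: str) -> M8Hit:
    # Assumption: tab-separated, standard 12-column BLAST tabular order.
    f = line.rstrip("\n").split("\t")
    return M8Hit(f[0], f[1], int(f[6]), int(f[7]), int(f[8]), int(f[9]))


# Used only for nucleotide hits reported on the reverse strand.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def slice_1based(seq: str, start: int, end: int) -> str:
    """Return seq[start..end], 1-based and inclusive.

    If start > end, the hit is assumed to be a nucleotide hit on the
    reverse strand, and the reverse complement of the region is returned.
    """
    if start <= end:
        return seq[start - 1:end]
    return seq[end - 1:start].translate(_COMPLEMENT)[::-1]


def aligned_regions(
    hit: M8Hit, queries: Dict[str, str], subjects: Dict[str, str]
) -> Tuple[str, str]:
    """Ungapped query and subject substrings covered by the hit."""
    return (
        slice_1based(queries[hit.qseqid], hit.qstart, hit.qend),
        slice_1based(subjects[hit.sseqid], hit.sstart, hit.send),
    )
```

Here `queries` and `subjects` are plain dictionaries mapping sequence IDs to sequence strings, for example loaded from the FASTA files used for the search; for large databases an indexed FASTA reader would be a more practical choice than holding everything in memory.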