galaxy_blast

works with sequences sharing same ids

Open FredericBGA opened this issue 6 years ago • 7 comments

A need that came up today from one of my colleagues.

FredericBGA avatar Oct 18 '19 15:10 FredericBGA

Just looking at the code I don't understand the goal here - do you have a simplified example with the before/after behaviour?

peterjc avatar Oct 18 '19 17:10 peterjc

Of course.

I have updated the tests accordingly.

Before this PR, with this kind of file:

>7
ATGC
>7
ATGC

the result was:

>7;7 representing 2 records
ATGC
>7;7 representing 2 records
ATGC

Now the result file is:

>7;7 representing 2 records
ATGC

We often have this case, as we fetch sequences from various sources or people and try to merge them.
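
To make the before/after behaviour concrete, here is a minimal Python sketch of the idea. It is not the code from this PR: `parse_fasta` and `merge_identical` are hypothetical helper names, and the `;` separator and the "representing N records" suffix are simply copied from the example output above.

```python
def parse_fasta(text):
    """Yield (identifier, sequence) pairs from FASTA-formatted text."""
    ident, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if ident is not None:
                yield ident, "".join(chunks)
            fields = line[1:].split()
            # The identifier is the first word of the header line.
            ident, chunks = (fields[0] if fields else ""), []
        else:
            chunks.append(line)
    if ident is not None:
        yield ident, "".join(chunks)


def merge_identical(records, sep=";"):
    """Write one FASTA entry per distinct sequence, joining the identifiers.

    Hypothetical helper: groups records by sequence (first-seen order)
    and emits a single merged entry for each group.
    """
    by_seq = {}
    for ident, seq in records:
        by_seq.setdefault(seq, []).append(ident)
    out = []
    for seq, ids in by_seq.items():
        header = sep.join(ids)
        if len(ids) > 1:
            header += " representing %d records" % len(ids)
        out.append(">%s\n%s\n" % (header, seq))
    return "".join(out)


print(merge_identical(parse_fasta(">7\nATGC\n>7\nATGC\n")), end="")
```

With the two-record input above, this sketch writes the merged entry once rather than once per input record.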

Is this clear to you?

FredericBGA avatar Oct 21 '19 07:10 FredericBGA

The failed tests are not related to this PR, right?

FredericBGA avatar Oct 21 '19 07:10 FredericBGA

So this is about input query files with repeated identifiers?

Repeated entries with the same identifier and the same sequence are one thing; repeated identifiers with different sequences are another. Personally I would make the latter an error condition, since they cause too many problems downstream.
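
As an illustration of that error condition only (not something implemented in this PR), a check along these lines could reject inputs where one identifier maps to different sequences. The function name `check_identifier_conflicts` is hypothetical, and the comparison is assumed to be an exact, case-sensitive string match.

```python
def check_identifier_conflicts(records):
    """Raise ValueError if an identifier is reused with a different sequence.

    `records` is an iterable of (identifier, sequence) pairs. Exact
    duplicates (same identifier, same sequence) are tolerated.
    """
    seen = {}
    for ident, seq in records:
        if ident in seen and seen[ident] != seq:
            raise ValueError(
                "identifier %r is used for different sequences" % ident
            )
        seen.setdefault(ident, seq)


# Exact duplicates pass silently.
check_identifier_conflicts([("7", "ATGC"), ("7", "ATGC")])

# The same identifier with a different sequence is rejected.
try:
    check_identifier_conflicts([("1", "A"), ("1", "A"), ("1", "T")])
except ValueError as err:
    print("rejected:", err)
```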

peterjc avatar Oct 21 '19 09:10 peterjc

The tool's master branch is failing on TravisCI against the Galaxy dev branch, see #120

peterjc avatar Oct 21 '19 09:10 peterjc

You're right, the output is not yet perfect in this case:

>1
A
>1
A
>1
T

Before this PR the output was:

>1;1 representing 2 records
A
>1;1 representing 2 records
A
>1;1 representing 2 records
T

With this PR it is now (which is not good either...):

>1;1 representing 2 records
A

FredericBGA avatar Oct 21 '19 09:10 FredericBGA

@peterjc This new version does what I needed. I'll let you review the code and merge it if you want. Thank you for your comments.

FredericBGA avatar Oct 22 '19 08:10 FredericBGA