Simplex read id in multiple duplex pairs
Hi guys,
We are investigating our first duplex run. I've read the useful discussions in #316 and #327 but couldn't find an obvious explanation for what we see.
From the docs and issues we understand that a simplex read (let's call it `r`) that is also part of a duplex pair (let's call it `d`) is tagged `dx:i:-1`, and the corresponding duplex pair `d` is in the form `r,t` (where `r` and `t` are the read names) and is tagged `dx:i:1`.
An example is this read (`0ee988dd-2227-47f7-ab19-99acfc66d686`), with the corresponding tags:
d69f94b2-51d2-4c61-8c3b-7104c6cccc2a;0ee988dd-2227-47f7-ab19-99acfc66d686 1
0ee988dd-2227-47f7-ab19-99acfc66d686 -1
So far so good, and indeed most of the simplex reads having a duplex pair follow this scheme.
There are, however, simplex reads (`dx:i:-1`) that have multiple duplex pairs, so that the read `r` appears in a first duplex `r,t` and in a second duplex `q,r`.
An example is this read (`d69f94b2-51d2-4c61-8c3b-7104c6cccc2a`):
d69f94b2-51d2-4c61-8c3b-7104c6cccc2a;0ee988dd-2227-47f7-ab19-99acfc66d686 1
ed2df147-bb5c-4215-98e6-69b7ed90b01c;d69f94b2-51d2-4c61-8c3b-7104c6cccc2a 1
d69f94b2-51d2-4c61-8c3b-7104c6cccc2a -1
What is happening here? Do the other two IDs basically refer to the same template read `d69f94b2-51d2-4c61-8c3b-7104c6cccc2a`, each being a partial duplex of a different part of it? Something like this (https://github.com/nanoporetech/dorado/issues/327#issuecomment-1691714958), but at different ends? I'm just guessing, as I couldn't find anything related to this. Sorry if I missed it.
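In case it helps to reproduce this, here is a minimal Python sketch of how such reads can be picked out. It assumes the read names and `dx` values have already been extracted as `(name, dx)` pairs, and that a duplex read name is two read IDs joined by `;`, as in the listings above. The function name `duplex_pairs_by_read` is made up for illustration and is not part of dorado.

```python
from collections import defaultdict


def duplex_pairs_by_read(records):
    """Map each read ID to the duplex read names it appears in.

    records: iterable of (read_name, dx) tuples, where dx is the integer
    value of the dx tag. Assumption: duplex reads (dx == 1) are named
    "<id>;<id>", as in the examples in this post.
    """
    pairs = defaultdict(list)
    for name, dx in records:
        if dx != 1:
            continue  # only duplex records carry a pair of read IDs
        for read_id in name.split(";"):
            pairs[read_id].append(name)
    return dict(pairs)


# The records quoted in this post, as (read name, dx value).
records = [
    ("d69f94b2-51d2-4c61-8c3b-7104c6cccc2a;0ee988dd-2227-47f7-ab19-99acfc66d686", 1),
    ("ed2df147-bb5c-4215-98e6-69b7ed90b01c;d69f94b2-51d2-4c61-8c3b-7104c6cccc2a", 1),
    ("d69f94b2-51d2-4c61-8c3b-7104c6cccc2a", -1),
    ("0ee988dd-2227-47f7-ab19-99acfc66d686", -1),
]

pairs = duplex_pairs_by_read(records)

# Keep only the read IDs that take part in more than one duplex pair.
multi = {read_id: names for read_id, names in pairs.items() if len(names) > 1}

for read_id, names in multi.items():
    print(read_id, len(names))
```

On these four records, only `d69f94b2-51d2-4c61-8c3b-7104c6cccc2a` is reported, with two duplex pairs.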
Thanks,
Davide