Biostrings icon indicating copy to clipboard operation
Biostrings copied to clipboard

Give `matchProbePair` the `fixed` parameter to allow IUPAC matching?

Open zachary-foster opened this issue 5 years ago • 8 comments

I recently need to predict amplicons using primers with lots of ambiguity codes. The matchProbePair function seemed perfect but it does not interpret the IUPAC codes. I noticed the code uses matchPattern to find primer matches and that function has the fixed option that allows interpretation of ambiguity codes when fixed=FALSE.

Can the matchProbePair function also have this option and pass its value to matchPattern like so?

matchProbePair <- function(Fprobe, Rprobe, subject, algorithm="auto",
             logfile=NULL, verbose=FALSE, fixed = FALSE)
    {
        ## This won't copy the data if Fprobe and Rprobe are already DNAString objects
        F <- DNAString(Fprobe)
        R <- DNAString(Rprobe)

        ## F and R hits on the + strand
        Fp_hits <- start(matchPattern(F, subject, algorithm=algorithm, fixed = fixed))
        Rp_hits <- start(matchPattern(R, subject, algorithm=algorithm, fixed = fixed))

        ## F and R hits on the - strand
        Fm_hits <- end(matchPattern(reverseComplement(F), subject, algorithm=algorithm, fixed = fixed))
        Rm_hits <- end(matchPattern(reverseComplement(R), subject, algorithm=algorithm, fixed = fixed))

        if (verbose) {
            cat("Fp_hits:", Fp_hits, "  Rp_hits:", Rp_hits,
                "  Fm_hits:", Fm_hits, "  Rm_hits:", Rm_hits, "\n")
        }

        matches0 <- Biostrings:::reduceProbePairMatches(c(Fp_hits, Rp_hits), c(Fm_hits, Rm_hits))
        ans <- Views(subject, start=matches0$start, end=matches0$end)

        if (!is.null(logfile)) {
            nFp <- length(Fp_hits)
            nRp <- length(Rp_hits)
            nFm <- length(Fm_hits)
            nRm <- length(Rm_hits)
            nmatches0 <- length(ans)
            ## cat("", ..., sep="\t") is a trick to get an extra tab
            cat("", nFp, nRp, nFm, nRm, nmatches0, file=logfile, sep="\t")
        }
        ans
}

I used the code above to do my analysis and it seemed to work.

I can submit a PR if you are interested.

Thanks!

zachary-foster avatar Aug 21 '19 17:08 zachary-foster

Pull requests are always encouraged. Please prepare and we will look it over more closely. Thank you.

lshep avatar Sep 03 '19 12:09 lshep

Hi @zachary-foster @lshep

has there been any update on this issue?

@zachary-foster Is your function above working as intended?

Thank you for your help.

johanneswerner avatar Apr 09 '20 16:04 johanneswerner

Thanks @zachary-foster for code. This was very helpful.

louellette avatar Feb 12 '21 11:02 louellette

@johanneswerner Sorry, I did not notice your question originally. The function above has been working as far as I can tell. If I remember correctly, I tried to get the package built on my computer to set up a pull request, but could not get the package to build for some reason. Since I could not run the tests to ensure that the changes would not cause problems, I gave up for the time and have not gotten back to it.

@louellette No problem!

zachary-foster avatar Feb 12 '21 20:02 zachary-foster

@zachary-foster Do you still remember what you have done? Maybe we can make this pull request set up together and add this functionality.

johanneswerner avatar Feb 15 '21 07:02 johanneswerner

I looked at the repo still on my computer and I basically just made the options max.mismatch, min.mismatch, with.indels, fixed from matchPattern available to the matchProbePair function call. Attached is the version of the matchProbePair source with my changes. Its the only file I changed.

matchProbePair.R.txt

zachary-foster avatar Feb 16 '21 19:02 zachary-foster

Thank you @zachary-foster

devtools::check() currently fails on the current state of the package due to unresolved dependencies to BiocGenerics, those should hopefully be resolved when R 4.1 is released and I can switch to the Bioconductor devel channel. I will then try to include your changes and try to rebuild the package. I think the only thing missing would be additional tests ... I hope. :-) Thank you for sharing your code.

johanneswerner avatar Feb 18 '21 17:02 johanneswerner

Sounds good, thanks!

zachary-foster avatar Feb 18 '21 17:02 zachary-foster

Just merged in this functionality. matchProbePair now allows a ... argument that passes any further arguments to matchPattern, allowing for the functionality you're describing without manually listing out all the arguments.

ahl27 avatar Jun 06 '24 19:06 ahl27