Add lowConfidence transcripts to output and improve NDR filtering options
- In this PR, transcripts that are normally filtered out, including subset transcripts and those above the NDR threshold, are placed into the metadata of the annotations in $subsetTranscripts and $lowConfidenceTranscripts respectively.
- This means that users who ran bambu with an unsuitable NDR setting and do not want to run discovery again can recover the missing transcripts from the metadata.
- To facilitate this, this PR adds the external setNDR function, which takes the extendedAnnotations and an NDR value and switches novel bambu annotations between the main annotations and the low-confidence annotations based on the threshold provided. If no threshold is provided, setNDR recommends an NDR using the same method used during transcript discovery.
- For this to work with annotations that have already been saved to a GTF file, bambu now writes the NDR, txScore and txScore.noFit as attributes in the GTF file, and prepareAnnotations reads them back in.
- Note that if annotations are written with an NDR threshold below 1, the low-confidence transcripts above that threshold will be missing from the written file.
- Added setNDR as part of quant, so users can provide their extendedAnnotations alongside an NDR threshold when running bambu and the NDR used for quantification is adjusted automatically. Users no longer need to filter by NDR manually.
- NDR and other stats are now copied over to equal transcripts even if they are above the NDR threshold (previously this only happened for those below the NDR threshold).
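The threshold switch described above can be illustrated with a small, language-agnostic sketch. This is not bambu's R implementation: the `Transcript` record, its field names, and the `set_ndr` helper are hypothetical, and the only behaviour taken from the description is that novel transcripts above the NDR threshold move to the low-confidence set while the rest stay in (or return to) the main annotations.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Transcript:
    tx_id: str
    ndr: float
    novel: bool  # hypothetical flag: True for novel transcripts, False for reference ones


def set_ndr(main: List[Transcript],
            low_confidence: List[Transcript],
            threshold: float) -> Tuple[List[Transcript], List[Transcript]]:
    """Re-partition transcripts between the main and low-confidence sets.

    Hypothetical helper: only novel transcripts are moved; reference
    annotations always stay in the main set.
    """
    kept, demoted = [], []
    for tx in main + low_confidence:
        if not tx.novel or tx.ndr <= threshold:
            kept.append(tx)
        else:
            # Novel transcript above the threshold: keep it, but set it aside.
            demoted.append(tx)
    return kept, demoted


main = [Transcript("ref1", 0.0, False), Transcript("novelA", 0.05, True)]
low = [Transcript("novelB", 0.3, True), Transcript("novelC", 0.8, True)]

# Relax the threshold: novelB is promoted back into the main annotations.
relaxed_main, relaxed_low = set_ndr(main, low, 0.5)

# Tighten it again: novelB returns to the low-confidence set, nothing is lost.
strict_main, strict_low = set_ndr(relaxed_main, relaxed_low, 0.1)
```

Because the demoted transcripts are retained rather than discarded, the threshold can be changed repeatedly without rerunning discovery, which is the point of storing them in the metadata.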
Minor change: Warnings no longer occur when the readGrgList contains seqlevels that are not in the annotations or genome. This was done by restricting the seqlevels of the reads to those actually present in the reads. The warning was occurring constantly because all the scaffolds used in alignment were in the BAM files, even if no reads from these scaffolds existed.
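The idea behind the seqlevels change can be sketched as follows. This is an illustration only, not the code in this PR: `used_seqlevels` and the example sequence names are hypothetical, and the sketch simply drops declared sequence names that no read maps to.

```python
from typing import Iterable, List


def used_seqlevels(declared_levels: Iterable[str],
                   read_seqnames: Iterable[str]) -> List[str]:
    """Return the declared levels that at least one read maps to,
    preserving the declared order (hypothetical helper)."""
    present = set(read_seqnames)
    return [level for level in declared_levels if level in present]


# All scaffolds used in alignment are declared, but reads only hit two chromosomes.
declared = ["chr1", "chr2", "scaffold_1", "scaffold_2"]
reads = ["chr1", "chr1", "chr2"]

levels_in_use = used_seqlevels(declared, reads)
```

With the unused scaffolds dropped, a comparison against the annotation or genome seqlevels no longer flags sequences that carry no reads.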
Todo - Unit Tests, Update bambu documentation to include setNDR
Codecov Report
:exclamation: No coverage uploaded for pull request base (devel@f547dca). The diff coverage is n/a.
:exclamation: Current head 50f7896 differs from pull request most recent head e159b48. Consider uploading reports for the commit e159b48 to get more accurate results
@@ Coverage Diff @@
## devel #367 +/- ##
========================================
Coverage ? 85.34%
========================================
Files ? 24
Lines ? 3303
Branches ? 0
========================================
Hits ? 2819
Misses ? 484
Partials ? 0