Multiplex_Major_Patch
Work in progress. Do not use or run
In this PR, transcripts that are normally filtered out, including subset transcripts and those above the NDR threshold, are placed into the metadata of the annotations in $subsetTranscripts and $lowConfidenceTranscripts respectively. This means that if users ran bambu with the wrong NDR setting and do not want to run discovery again, they can recover the missing transcripts from the metadata.

To facilitate this, the PR adds the external setNDR function, which takes the extendedAnnotations and an NDR value and switches novel bambu annotations between the main annotations and the low confidence annotations based on the threshold provided. If no threshold is provided, setNDR will recommend an NDR with the same method used during transcript discovery.

For this to work with annotations that have already been saved to a gtf file, bambu now writes the NDR, txScore and txScore.noFit as attributes to the gtf file, and these are also read in by prepareAnnotations. Note that if annotations are written with an NDR threshold of <1, the low confidence transcripts will be missing from the gtf file.

setNDR has also been added as part of quant, which means that users can provide their extendedAnnotations alongside an NDR threshold when running bambu and it will automatically adjust the NDR used for quant. Users therefore do not need to filter by NDR manually.

NDR and other stats are now copied over to equal transcripts even if they are above the NDR threshold (previously this only happened for those below the NDR threshold).

Minor change: Warnings will no longer occur if there are seqlevels in the readGrgList that are not in the annotations or genome. This was done by restricting the seqlevels of the reads to those actually present in the reads. The warning was constantly occurring because all the scaffolds used in alignment were in the bam files, even if no reads from these scaffolds existed.
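As a rough illustration of the switching behaviour described for setNDR, the sketch below re-partitions transcripts between a main set and a low confidence set for a given threshold. It is written in Python purely as a language-agnostic sketch and is not bambu's R implementation. The Transcript class, the set_ndr function, the novel flag, and the choice to keep transcripts whose NDR equals the threshold are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transcript:
    # Hypothetical record; bambu stores these as annotation ranges with metadata.
    tx_id: str
    ndr: float
    novel: bool  # True for bambu-discovered transcripts, False for reference ones


def set_ndr(main, low_confidence, threshold):
    """Re-partition transcripts between the main and low confidence sets.

    Assumption: novel transcripts with NDR above the threshold are moved to
    the low confidence set; everything else stays in (or returns to) main.
    Reference transcripts are never moved.
    """
    new_main, new_low = [], []
    for tx in list(main) + list(low_confidence):
        if tx.novel and tx.ndr > threshold:
            new_low.append(tx)
        else:
            new_main.append(tx)
    return new_main, new_low


# Example: a stricter threshold moves a novel transcript out of the main set.
main = [Transcript("ref1", 0.9, False), Transcript("novelTx1", 0.05, True)]
low = [Transcript("novelTx2", 0.3, True)]
main, low = set_ndr(main, low, threshold=0.1)
```

Because both sets are pooled before partitioning, the same call works whether the new threshold is stricter or more lenient than the one used during discovery, which is the property that lets users avoid re-running discovery.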
Todo - Unit Tests, Update bambu documentation to include setNDR