htsjdk icon indicating copy to clipboard operation
htsjdk copied to clipboard

Fix EdgeReadIterator

Open michaelgatzen opened this issue 2 years ago • 2 comments

  • Fixing #1615
    • EdgeReadIterator was calling AbstractLocusIterator's getCurrentInterval() to determine the read overlaps. However, the lastInterval that it uses internally was never updated. This PR removes the method getCurrentInterval() and instead implements this functionality in EdgeReadIterator, which is the only place this has ever been called from (at least in Picard and GATK). This is arguably a breaking API change, although I would argue that this method has never worked and should not be placed in AbstractLocusIterator, but instead in an iterator implementation that actually does keep track of the intervals.
    • Keeping track of the intervals has now been moved to AbstractLocusIterator
    • Added a new function that determines if a read is fully contained by the current interval, in that case we don't need to determine the overlaps
    • Renamed a few variables to better describe their use
  • Added tests to catch previously faulty behavior
    • Also extended existing tests with the possibility to define custom headers
  • Fixed another bug in AbstractLocusIterator and IntervalListReferenceSequenceMask, where the documentation says that it would sort an unsorted input IntervalList, but it never does

michaelgatzen avatar Jun 30 '22 14:06 michaelgatzen

Codecov Report

Merging #1616 (d564490) into master (489c419) will increase coverage by 0.022%. The diff coverage is 81.633%.

@@               Coverage Diff               @@
##              master     #1616       +/-   ##
===============================================
+ Coverage     69.840%   69.862%   +0.022%     
- Complexity      9684      9696       +12     
===============================================
  Files            703       703               
  Lines          37752     37770       +18     
  Branches        6131      6139        +8     
===============================================
+ Hits           26366     26387       +21     
+ Misses          8932      8928        -4     
- Partials        2454      2455        +1     
Impacted Files Coverage Δ
...mtools/util/IntervalListReferenceSequenceMask.java 72.973% <33.333%> (+2.703%) :arrow_up:
...va/htsjdk/samtools/util/AbstractLocusIterator.java 69.945% <50.000%> (-1.004%) :arrow_down:
...in/java/htsjdk/samtools/util/EdgeReadIterator.java 82.090% <92.105%> (+4.731%) :arrow_up:
src/main/java/htsjdk/samtools/util/Interval.java 70.000% <0.000%> (+3.333%) :arrow_up:
.../htsjdk/samtools/util/AbstractRecordAndOffset.java 84.615% <0.000%> (+3.846%) :arrow_up:
...htsjdk/samtools/util/nio/DeleteOnExitPathHook.java 90.476% <0.000%> (+9.524%) :arrow_up:

codecov-commenter avatar Jul 25 '22 19:07 codecov-commenter

@lbergelson Thanks for the review, and all the deep digging you did for that! I addressed most of the comments, feel free to comment or resolve the threads!

michaelgatzen avatar Jul 25 '22 21:07 michaelgatzen