Miloš Marić
When creating indices in IndexCache, synchronize generation and communication streams using events, not `cudaStreamSynchronize()`. Continuation of #318
When doing all-vs-all, do not look for anchors of reads with the same `read_id`. Currently we are doing this for the sake of code simplicity, but this introduces additional overlaps...
When copying indices from host to device use stream callback functions to update the state indices upon copy completion. Continuation of #318
If a pair of indices causes an OOM error even when no other indices are kept on device (Issue #490, also see #489), split the indices into several smaller indices and find...
Keep a list of pairs of indices which were skipped due to an OOM error (Issue #489). Once all other pairs of indices have been processed, go over the skipped ones...
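The bookkeeping described above can be sketched in plain C++. This is only an illustration under stated assumptions: `IndexPair`, `process_and_collect_skipped` and the `process` callback are hypothetical names, and `std::bad_alloc` stands in for whatever exception the device allocator actually raises on OOM. What is done with the skipped pairs afterwards is left to the caller.

```cpp
#include <cstddef>
#include <functional>
#include <new>
#include <utility>
#include <vector>

// Hypothetical identifier type: a pair of index ids (not taken from the code base).
using IndexPair = std::pair<std::size_t, std::size_t>;

// Runs `process` on every pair. A pair whose processing throws std::bad_alloc
// (assumed stand-in for a device OOM error) is recorded instead of aborting the run.
// Returns the skipped pairs in their original order.
std::vector<IndexPair> process_and_collect_skipped(
    const std::vector<IndexPair>& pairs,
    const std::function<void(const IndexPair&)>& process)
{
    std::vector<IndexPair> skipped;
    for (const IndexPair& pair : pairs)
    {
        try
        {
            process(pair);
        }
        catch (const std::bad_alloc&)
        {
            skipped.push_back(pair); // remember the pair for a later pass
        }
    }
    return skipped;
}
```

The returned list can then be handed to a second pass once every other pair has been processed.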
Currently, `Overlapper::post_process_overlaps()` and new functionality to be added in PR #422 run on CPU. They should be moved to GPU. There are two reasons for that: a) One matcher +...
Allocating/resizing some host arrays (e.g. when creating host copies of indices) takes a significant amount of time. Implement a host pool allocator. Also evaluate the possibility of using implementations from Thrust...
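To illustrate the idea of a host pool, here is a minimal sketch that caches released buffers and hands them back out instead of allocating again. `HostBufferPool` and its methods are hypothetical names, the pool uses ordinary pageable memory via `std::vector<char>`, and it is not thread-safe; a real implementation would likely differ on all three points.

```cpp
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// Hypothetical host buffer pool: released buffers are cached by capacity and reused.
class HostBufferPool
{
public:
    // Returns a buffer of exactly `bytes` elements, reusing the smallest cached
    // buffer whose capacity is sufficient; allocates a new one otherwise.
    std::vector<char> acquire(std::size_t bytes)
    {
        auto it = free_.lower_bound(bytes);
        if (it != free_.end())
        {
            std::vector<char> buffer = std::move(it->second);
            free_.erase(it);
            buffer.resize(bytes); // capacity >= bytes, so no reallocation happens
            return buffer;
        }
        return std::vector<char>(bytes);
    }

    // Hands a buffer back to the pool so that a later acquire() can reuse it.
    void release(std::vector<char>&& buffer)
    {
        const std::size_t capacity = buffer.capacity();
        free_.emplace(capacity, std::move(buffer));
    }

    // Number of buffers currently cached.
    std::size_t cached() const { return free_.size(); }

private:
    std::multimap<std::size_t, std::vector<char>> free_; // capacity -> cached buffer
};
```

A caller would `acquire()` a buffer before creating a host copy and `release()` it afterwards, so repeated copies of similar size avoid fresh allocations.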
`SketchElement`/`Minimizer` objects are not used anymore. `IndexGPU` internally relies on `SketchElementImpl::ReadidPositionDirection`. Its output consists of the content of `SketchElementImpl::ReadidPositionDirection` split into three separate arrays. Look into ways to: 1) Change...
Comment on https://github.com/lh3/ksw2/blob/4e0a1ccba8c6ccc87e0342c9712531bde783bf90/ksw2_extd2_sse.c#L255 (and similar comments below) should read `d |= a > 0? 1
In `ksw_extd2_sse()`, if valid elements of an antidiagonal have target indices in range `[st0, en0]`, all elements in `[st=st0/16*16, en=(en0+16)/16*16-1]` are going to be processed, i.e. all elements within 16-element...
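The rounding in the formula above can be checked with a few lines of C++. This is a sketch of the arithmetic only, not code from ksw2: `padded_range` and `kBlock` are hypothetical names, and non-negative indices are assumed so that integer division rounds down.

```cpp
#include <utility>

// Block width used by the formula above (16 elements).
constexpr int kBlock = 16;

// Expands the valid range [st0, en0] to whole 16-element blocks:
// st is rounded down to a multiple of 16, en is rounded up to
// (a multiple of 16) - 1. Assumes 0 <= st0 <= en0.
std::pair<int, int> padded_range(int st0, int en0)
{
    const int st = st0 / kBlock * kBlock;
    const int en = (en0 + kBlock) / kBlock * kBlock - 1;
    return {st, en};
}
```

For example, a valid range of `[5, 20]` (16 elements) expands to `[0, 31]`, so 32 elements are processed, while an already aligned range such as `[16, 31]` is left unchanged.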