nearestKNeighbors does not handle seqlevel pruning

Open yeyuan98 opened this issue 2 years ago • 2 comments

In the current version (commit https://github.com/Bioconductor/GenomicRanges/commit/d20afa47c9e7acbf213bbdc1ded043bd817f2458), nearestKNeighbors fails if subject contains seqlevels not used by x, because of this line in .nearestKNeighbors:

seqlevels(subject) <- seqlevels(x)

Changing that line to the following should presumably suffice:

seqlevels(subject, pruning.mode = "tidy") <- seqlevels(x)
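For illustration, here is a minimal sketch of the behaviour outside the function, using made-up toy ranges. It assumes the failure comes from subject having a range on a seqlevel that x lacks, and it uses pruning.mode = "coarse" (the mode documented for non-list objects such as GRanges) rather than "tidy":

```r
library(GenomicRanges)

# Toy data (hypothetical): 'subject' has a range on chr2, which 'x' does not use.
x <- GRanges("chr1", IRanges(c(10, 50), width = 5))
subject <- GRanges(c("chr2", "chr1", "chr1"),
                   IRanges(c(1, 20, 100), width = 5))

# Plain assignment refuses to drop a seqlevel that still has ranges on it.
plain <- tryCatch({
    s <- subject
    seqlevels(s) <- seqlevels(x)
    "ok"
}, error = function(e) "error")

# With pruning enabled the assignment succeeds, but the chr2 range is removed.
pruned <- subject
seqlevels(pruned, pruning.mode = "coarse") <- seqlevels(x)

plain            # "error"
length(subject)  # 3
length(pruned)   # 2
```

Note that the pruned object is shorter than the original, so positions in it no longer line up with positions in subject.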

Would be more than happy to create a pull request or contribute in any way as needed. Thanks a lot for the great Bioconductor infrastructure packages.

yeyuan98, Jan 09 '23 19:01

Sorry, the fix above introduces a new problem: the pruned subject is indexed differently from the original subject, so the returned indices no longer refer to the subject the caller passed in and the return values become meaningless.

I guess the end user should prune the subject on their own before feeding the subject into this function.
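A rough sketch of what that user-side pruning could look like while still reporting indices relative to the original subject. The helper name nearestKNeighborsPruned is hypothetical and not part of GenomicRanges; the sketch assumes nearestKNeighbors returns an IntegerList of subject indices (as with the default select), and it has not been checked against the commit above:

```r
library(GenomicRanges)

# Hypothetical helper, not part of GenomicRanges.
nearestKNeighborsPruned <- function(x, subject, k = 1L) {
    # Positions in the original 'subject' whose seqlevel is also present in 'x'.
    keep <- which(as.character(seqnames(subject)) %in% seqlevels(x))
    # Subsetting keeps the unused seqlevels, which can then be dropped without error.
    hits <- nearestKNeighbors(x, subject[keep], k = k)
    # 'hits' indexes subject[keep]; translate back to positions in 'subject'.
    relist(keep[unlist(hits, use.names = FALSE)], hits)
}

# Toy data (hypothetical): the chr2 range sits at position 1 of 'subject'.
x <- GRanges("chr1", IRanges(c(10, 50), width = 5))
subject <- GRanges(c("chr2", "chr1", "chr1"),
                   IRanges(c(1, 20, 100), width = 5))

res <- nearestKNeighborsPruned(x, subject, k = 1L)
```

Keeping the keep vector around is the whole point: it records which original positions survived the pruning, so indices can be mapped back instead of silently shifting.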

yeyuan98, Jan 09 '23 19:01

@yeyuan98

> I guess the end user should prune the subject on their own before feeding the subject into this function.

That is not totally satisfactory. We need to take a close look, but hopefully there is a better way to handle this. Thanks for reporting this.

hpages, Jan 10 '23 16:01