guacamole
push down filtering by locus into a predicate
When calling variants at only a subset of loci, it would be great if we could push that into an ADAM predicate to avoid loading the whole dataset. This would mean writing a predicate that takes a LociSet and a window size, and filters overlapping reads. It would also involve some refactoring, including the loadReads() and loci() functions in Common, since right now the filtering is done separately after loading.
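To make the idea concrete, here is a rough sketch of the overlap test such a predicate would perform. The types `Interval`, `Read`, `SimpleLociSet` and the object `LociFilter` are hypothetical, simplified stand-ins, not the actual guacamole or ADAM classes, and intervals are assumed to be half-open. An actual pushdown would presumably need the same test expressed over the stored contig, start, and end fields so that records can be skipped at load time; that part is not shown here.

```scala
// Hypothetical, simplified stand-ins; not guacamole's LociSet or ADAM's read record.
final case class Interval(start: Long, end: Long)              // half-open [start, end)
final case class Read(contig: String, start: Long, end: Long)  // half-open aligned span

final case class SimpleLociSet(byContig: Map[String, Seq[Interval]]) {
  // True if the read, padded by `window` bases on each side,
  // overlaps at least one interval on the read's contig.
  def overlaps(read: Read, window: Long): Boolean =
    byContig.getOrElse(read.contig, Seq.empty).exists { iv =>
      read.start - window < iv.end && iv.start < read.end + window
    }
}

object LociFilter {
  // Builds a filter function from a loci set and a window size.
  def predicate(loci: SimpleLociSet, window: Long): Read => Boolean =
    read => loci.overlaps(read, window)
}
```

With this shape, `reads.filter(LociFilter.predicate(loci, window))` would keep only reads within `window` bases of a requested locus, which is the same result the current post-load filtering is meant to produce.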
We can do this if we are loading ADAM reads or some other Parquet-serialized data. Thoughts on this approach, @ryan-williams @timodonnell?