
option for shuffle to maintain spacing

Open jayhesselberth opened this issue 7 years ago • 3 comments

from the paper:

The p-value and the direction of difference from the null hypothesis (that the positions of Q and R are independent) are obtained by permutation. Each permutation randomizes the query intervals uniformly across the chromosome, maintaining the spacing between intervals.

emphasis added (on "maintaining the spacing between intervals"). seems like a reasonable option for bed_shuffle().

jayhesselberth, Nov 14 '16 12:11

the easiest way to do this would be:

  1. calculate interval spacing
  2. pick a random start for the first interval
  3. use the spacings to set the coordinates for the other intervals.

this would need to handle spacing per chromosome, and also make sure that all intervals on a chrom stay in-bounds given the random start.
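A minimal sketch of the three steps above, written in plain Python rather than against valr's actual API. The function name `shuffle_keep_spacing`, the tuple layout `(chrom, start, end)`, and the `chrom_sizes` dict are all hypothetical, and coordinates are assumed to be 0-based, half-open. Because every interval on a chromosome is moved by the same shift, the spacings are preserved without being stored explicitly, and bounding the random start by the block's total span keeps everything in-bounds.

```python
import random


def shuffle_keep_spacing(intervals, chrom_sizes, rng=None):
    """Shift each chromosome's intervals by one random offset.

    intervals: list of (chrom, start, end) tuples (hypothetical layout).
    chrom_sizes: dict mapping chrom name to chromosome length.
    rng: optional random.Random instance, for reproducible shuffles.
    """
    if rng is None:
        rng = random.Random()

    # Group intervals per chromosome, since spacing is handled per chrom.
    by_chrom = {}
    for chrom, start, end in intervals:
        by_chrom.setdefault(chrom, []).append((start, end))

    shuffled = []
    for chrom, ivls in by_chrom.items():
        ivls.sort()
        first_start = ivls[0][0]
        last_end = max(end for _, end in ivls)
        # Total footprint of the block, including all internal spacings.
        span = last_end - first_start
        size = chrom_sizes[chrom]
        if span > size:
            raise ValueError(f"intervals on {chrom} span more than the chromosome")

        # Pick a random start for the first interval such that the
        # whole block still fits inside [0, size].
        new_first = rng.randint(0, size - span)
        shift = new_first - first_start

        # Applying one shift to every interval keeps all spacings intact.
        for start, end in ivls:
            shuffled.append((chrom, start + shift, end + shift))
    return shuffled
```

Note that a block spanning most of a chromosome leaves very few valid starts, so the permutations would barely differ from the input in that case.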

jayhesselberth, Nov 16 '16 19:11

for simplicity, the above should probably ignore the incl and excl params and issue a message to that effect.

jayhesselberth, Nov 17 '16 19:11

Could also pick a random offset into the spacing distances, using that one as the first and then circling back when the end is reached.
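One possible reading of this suggestion, sketched in plain Python (not valr code): treat the gaps between consecutive intervals as a circular list, start from a randomly chosen gap, and wrap around to the beginning once the end of the list is reached. The name `rotate_spacings` and its arguments are hypothetical, and the sketch assumes the intervals sit on a single chromosome and do not overlap. Interval widths stay in their original order; only the order of the gaps changes.

```python
import random


def rotate_spacings(intervals, first_start, rng=None):
    """Rebuild intervals with the gap list rotated by a random offset.

    intervals: list of (start, end) tuples on one chromosome (assumed
        non-overlapping).
    first_start: start coordinate to use for the first rebuilt interval.
    rng: optional random.Random instance, for reproducible results.
    """
    if rng is None:
        rng = random.Random()

    ivls = sorted(intervals)
    widths = [end - start for start, end in ivls]
    # Distance from the end of each interval to the start of the next.
    gaps = [ivls[i + 1][0] - ivls[i][1] for i in range(len(ivls) - 1)]

    if gaps:
        # Random offset into the gaps; slicing wraps the list around.
        k = rng.randrange(len(gaps))
        gaps = gaps[k:] + gaps[:k]

    rebuilt = []
    pos = first_start
    for i, width in enumerate(widths):
        rebuilt.append((pos, pos + width))
        if i < len(gaps):
            pos += width + gaps[i]
    return rebuilt
```

Since the widths and the set of gaps are unchanged, the rebuilt block has the same total span as the original, so the in-bounds check for a given start is the same as in the simple shifting approach.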

jayhesselberth, Nov 19 '16 12:11

closing for now

jayhesselberth, Oct 05 '22 13:10