Normalizing the length of a set of variable length intervals

Open amatria opened this issue 3 years ago • 2 comments

Issue

I find it very hard to believe that bedtools has no option to create equal-length intervals from a set of variable-length intervals. Am I missing something?

I know this can be done with the following awk command:

user@host$ TARGET_LENGTH=50
user@host$ awk -vF=${TARGET_LENGTH} 'BEGIN{ OFS="\t"; }{ len=$3-$2; diff=F-len; flank=int(diff/2); upflank=downflank=flank; if (diff%2==1) { downflank++; }; print $1, $2-upflank, $3+downflank; }' in.bed | sort-bed - > out.bed

(source: https://www.biostars.org/p/241085/)

However, what if the resulting intervals go beyond the size of a chromosome? What if the resulting interval starts below position 0?

Example

input.bed:

chr1     3    10
chr1    17    27
chr1    10    15
chr1    42    50

genome.bed:

chrom   size
chr1    50

Now, I want to normalize the intervals in input.bed so that they span 20bp each:

chr1     0    20
chr1    12    32
chr1     2    22
chr1    30    50

However, with the awk command above I would get the following result:

chr1    -3    17 # wrong
chr1    12    32
chr1     3    23
chr1    36    56 # wrong

amatria • Nov 08 '22 12:11

Does the makewindows tool give you what you need?

arq5x • Nov 30 '22 13:11

As far as I understand, no: makewindows does not give me what I need. It can only split each interval into smaller subintervals, whereas I want to extend each interval to a fixed, larger length.

amatria • Dec 05 '22 16:12
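
The behaviour asked for in the issue (pad each interval to a fixed length, then keep it inside the chromosome) could be sketched as a small extension of the awk one-liner. This is only a sketch under some assumptions, not a bedtools feature: `normalize_bed` is a hypothetical helper name, the genome file is assumed to be a two-column chrom/size table (a non-numeric header line is skipped), and an interval that falls off either end is assumed to be shifted back rather than truncated. The one-liner quoted in the issue adds the odd extra base downstream, while the desired output shown in the issue (for example `chr1 2 22`) corresponds to adding it upstream, so the sketch uses the upstream convention.

```shell
# Hypothetical helper (not a bedtools command): resize every BED interval to a
# fixed length, then shift it back inside [0, chromosome size].
# Usage: normalize_bed TARGET_LENGTH GENOME_FILE < in.bed
normalize_bed() {
    awk -v F="$1" '
        BEGIN { OFS = "\t" }
        # First file: chromosome sizes. Non-numeric lines, such as a
        # "chrom size" header, are skipped.
        FILENAME == ARGV[1] { if ($2 ~ /^[0-9]+$/) size[$1] = $2 + 0; next }
        {
            diff = F - ($3 - $2)     # bases to add (negative: bases to trim)
            down = int(diff / 2)     # int() truncates toward zero
            up   = diff - down       # any odd extra base goes upstream
            start = $2 - up
            stop  = $3 + down
            # Shift right if the interval starts before position 0.
            if (start < 0) { stop -= start; start = 0 }
            # Shift left if it runs past the end of a known chromosome.
            if (($1 in size) && stop > size[$1]) {
                start -= stop - size[$1]
                stop = size[$1]
                if (start < 0) start = 0   # chromosome shorter than F
            }
            print $1, start, stop
        }
    ' "$2" -
}
```

With the files from the issue, `normalize_bed 20 genome.bed < input.bed > out.bed` would be the intended call. Output keeps the input order, so a sort step can be piped after it if needed. Chromosomes missing from the genome file are only clamped at position 0. As far as I know, `bedtools slop` with `-g` clips intervals at chromosome boundaries instead of shifting them, so on its own it would shorten the edge intervals rather than keep them at the target length.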