What is the difference between "absolute peak summit" and "summit position" in narrowPeak format?

Open mcsimenc opened this issue 1 year ago • 1 comments

Hi, I want to understand the difference between columns 5 and 10 in the macs3 narrowPeak output. They are different, but column 10, the "relative summit position to peak start", corresponds to what is found in the summits.bed file. In IGV, field 10 is labeled as "peak" and field 5 is labeled as "score". Here is an example:

summits.bed

Chr1	13972	13973	peak_1	12.7509

narrowPeak

Chr1	13824	14257	peak_1	127	.	4.86034	15.5595	12.7509	148

148 = 13972 - 13824

127 = ?

Thanks so much

BED field descriptions drawn from: https://github.com/macs3-project/MACS/blob/master/docs/callpeak.md

mcsimenc, Feb 01 '24 18:02

@mcsimenc the 5th column, according to the definition of narrowPeak, represents the peak score. In MACS, the score is the integer form of 10 x -log10(qvalue), i.e. 10 times the 9th column. In your example, int(10 x 12.7509) = 127. In the callpeak.md file you refer to, you can find this description:

NAME_peaks.narrowPeak is BED6+4 format file which contains the peak locations together with peak summit, p-value, and q-value. You can load it to the UCSC genome browser. Definition of some specific columns are:

5th: integer score for display. It's calculated as int(-10*log10(pvalue)) or int(-10*log10(qvalue)) depending on whether -p (pvalue) or -q (qvalue) is used as score cutoff. Please note that currently, this value might be out of the [0-1000] range defined in UCSC ENCODE narrowPeak format. You can let the value saturate at 1000 (i.e. p/q-value = 10^-100) by using the following 1-liner awk: awk -v OFS="\t" '{$5=$5>1000?1000:$5} {print}' NAME_peaks.narrowPeak
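
To check this against the example record from the question, here is a minimal Python sketch. It assumes the default q-value cutoff was used, so that column 5 is derived from column 9. The helper name parse_narrowpeak and the dictionary keys are made up for illustration and are not part of MACS.

```python
def parse_narrowpeak(line):
    """Split one tab-separated narrowPeak record into named fields.

    Hypothetical helper for illustration only; field names are my own labels
    for the BED6+4 columns described in callpeak.md.
    """
    f = line.rstrip("\n").split("\t")
    return {
        "chrom": f[0],
        "start": int(f[1]),
        "end": int(f[2]),
        "name": f[3],
        "score": int(f[4]),               # column 5: integer display score
        "strand": f[5],
        "fold_enrichment": float(f[6]),   # column 7
        "neg_log10_pvalue": float(f[7]),  # column 8
        "neg_log10_qvalue": float(f[8]),  # column 9
        "summit_offset": int(f[9]),       # column 10: summit relative to start
    }


# The narrowPeak record quoted in the question.
record = parse_narrowpeak(
    "Chr1\t13824\t14257\tpeak_1\t127\t.\t4.86034\t15.5595\t12.7509\t148"
)

# Column 5 is the truncated value of 10 x column 9 (assuming a -q cutoff).
expected_score = int(10 * record["neg_log10_qvalue"])

# Column 10 is an offset; adding it to the peak start gives the summit
# position reported as the start coordinate in summits.bed.
summit_start = record["start"] + record["summit_offset"]

# Same effect as the awk 1-liner: cap the score at the UCSC maximum of 1000.
capped_score = min(record["score"], 1000)

print(expected_score, summit_start, capped_score)
```

So column 5 is a display score derived from the significance value, while column 10 locates the summit within the peak; the two are unrelated quantities.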

taoliu, Feb 15 '24 00:02