getScorePerBigWigBin.py error with output from bigWigMerge

Open perinom opened this issue 2 years ago • 1 comment

Hi, thanks for the amazing tools!

I'm having an issue with computeMatrix when providing bigWig files generated via bigWigMerge.

Briefly, my pipeline is:

# generate single replicate bw files
[...]

# merge bigwig (output bedgraph)
bigWigMerge file1.bw file2.bw merged.bedgraph

# convert merged to bw
bedGraphToBigWig merged.bedgraph chr_sises.txt merged.bw

Now, when I run computeMatrix on these merged bw files, I get:

[bwHdrRead] There was an error while reading in the header!
[pyBwOpen] bw is NULL!
Traceback (most recent call last):
  File "/myDir/miniconda3/envs/deep_test/bin/computeMatrix", line 14, in <module>
    main(args)
  File "/myDir//miniconda3/envs/deep_test/lib/python3.7/site-packages/deeptools/computeMatrix.py", line 421, in main
    hm.computeMatrix(scores_file_list, args.regionsFileName, parameters, blackListFileName=args.blackListFileName, verbose=args.verbose, allArgs=args)
  File "/myDir/miniconda3/envs/deep_test/lib/python3.7/site-packages/deeptools/heatmapper.py", line 251, in computeMatrix
    chromSizes, _ = getScorePerBigWigBin.getChromSizes(score_file_list)
  File "/myDir/miniconda3/envs/deep_test/lib/python3.7/site-packages/deeptools/getScorePerBigWigBin.py", line 158, in getChromSizes
    fh = pyBigWig.open(fname)
RuntimeError: Received an error during file opening!

The pre-merge bw files (generated with bedGraphToBigWig) are fine and run without complaints. Both the pre-merge and post-merge bw files load in IGV without problems and look as they should.
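Since the error comes from the header read, one quick sanity check is to look at the first bytes of the merged file directly. The sketch below is only a diagnostic aid, not part of deepTools or pyBigWig: `read_bigwig_header` is a hypothetical helper, and it assumes the standard bigWig layout in which the file starts with a 4-byte magic number (0x888FFC26), followed by a 2-byte version and a 2-byte zoom-level count.

```python
import struct
import sys

# Magic number that opens a bigWig file (per the UCSC bigWig format description).
BIGWIG_MAGIC = 0x888FFC26


def read_bigwig_header(path):
    """Hypothetical helper: return (magic_ok, version, zoom_levels).

    Only the first 8 bytes are inspected: magic (uint32), version (uint16),
    zoom level count (uint16). Both byte orders are tried.
    """
    with open(path, "rb") as fh:
        raw = fh.read(8)
    if len(raw) < 8:
        raise ValueError("file is too short to hold a bigWig header")
    for byte_order in ("<", ">"):  # little-endian first, then big-endian
        magic, version, zoom_levels = struct.unpack(byte_order + "IHH", raw)
        if magic == BIGWIG_MAGIC:
            return True, version, zoom_levels
    return False, None, None


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python check_header.py merged.bw
    print(read_bigwig_header(sys.argv[1]))
```

If the magic number matches for both the pre-merge and the merged file, the difference presumably lies further into the header or index, which this check does not cover.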

The only difference I noticed in the corresponding bedGraph files is that the pre-merge file has a first line covering the chromosome from 0 to the first base with signal, while the post-merge file starts directly with the signal regions. For example:

pre-merge

chr2L   0       5106    0.000000
chr2L   5106    5107    0.082136
chr2L   5107    5108    0.164271
chr2L   5108    5109    0.246407

post-merge

chr2L   5106    5107    0.082136
chr2L   5107    5108    0.164271
chr2L   5108    5109    0.246407
chr2L   5109    5110    0.328542

Could this be sufficient to create problems? If so, is there a quick fix?
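One way to test this hypothesis would be to pad the merged bedGraph with explicit zero intervals, so that it has the same shape as the pre-merge file, before running bedGraphToBigWig. The following is a minimal sketch under stated assumptions: `fill_gaps` is a hypothetical helper, the input rows are assumed to be sorted by chromosome and start, and the chromosome length used in the demo is made up for illustration. Whether this padding actually resolves the error is not confirmed here.

```python
def fill_gaps(intervals, chrom_sizes):
    """Yield bedGraph rows with every uncovered stretch filled with 0.0.

    intervals:   iterable of (chrom, start, end, value), sorted by chrom, start
    chrom_sizes: dict mapping chromosome name to its length
    Chromosomes that have no interval at all are not emitted.
    """
    prev_chrom, prev_end = None, 0
    for chrom, start, end, value in intervals:
        if chrom != prev_chrom:
            # Pad the tail of the chromosome we just finished.
            if prev_chrom is not None and prev_end < chrom_sizes[prev_chrom]:
                yield (prev_chrom, prev_end, chrom_sizes[prev_chrom], 0.0)
            prev_chrom, prev_end = chrom, 0
        if start > prev_end:
            # Pad the gap before this interval (including the leading gap from 0).
            yield (chrom, prev_end, start, 0.0)
        yield (chrom, start, end, value)
        prev_end = end
    # Pad the tail of the last chromosome.
    if prev_chrom is not None and prev_end < chrom_sizes[prev_chrom]:
        yield (prev_chrom, prev_end, chrom_sizes[prev_chrom], 0.0)


if __name__ == "__main__":
    # Rows from the post-merge example; the chromosome length is hypothetical.
    rows = [("chr2L", 5106, 5107, 0.082136), ("chr2L", 5107, 5108, 0.164271)]
    for chrom, start, end, value in fill_gaps(rows, {"chr2L": 6000}):
        print(f"{chrom}\t{start}\t{end}\t{value:.6f}")
```

The padded rows can then be written to a file and passed to bedGraphToBigWig as usual.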

computeMatrix command:

computeMatrix reference-point \
            --regionsFileName $PEAKS \
            --scoreFileName $ATAC $H3K4me3 $H3K27Ac $IgG \
            --outFileName $MATRIX_DIR/$MATRIX \
            --referencePoint center --beforeRegionStartLength 2500 --afterRegionStartLength 2500 \
            --averageTypeBins mean --binSize 10 --missingDataAsZero \
            --smartLabels \
            --numberOfProcessors 14

I get the same error message across multiple Python/deepTools combinations, all in fresh, dedicated conda environments:

deeptools 3.5.1 in Python 3.7.12
deeptools 3.5.1 in Python 3.9.15
deeptools 3.5.1 in Python 3.10.8
deeptools 3.4.3 in Python 3.9.15

Thanks!

EDIT: I updated the title, as the same issue happens with multiBigwigSummary and seems to come from getScorePerBigWigBin.py.

perinom, Apr 04 '23 23:04

I found a workaround:

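# merge the bedGraph files instead of the bigWig files, averaging the two replicates
# (no int() around the average, since it would truncate fractional signal to an integer)
bedtools unionbedg -i rep1.bg rep2.bg -g chr.sizes -empty | \
            awk 'BEGIN {OFS = "\t"} {avg=($4+$5)/2; print $1,$2,$3,avg}'  > averaged.bg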

# convert the combined bedGraph to bw
bedGraphToBigWig  averaged.bg   chr.sizes averaged.bw

This confirms that the incompatibility is with the output format of bigWigMerge.

You can close this if you wish, but since bigWigMerge is an official UCSC tool, it would be great if deepTools could handle its output.

Thanks again for deeptools!

perinom, Apr 05 '23 09:04