
default value instead of np.nans for cooler matrix

Open mimakaev opened this issue 7 years ago • 2 comments

Many algorithms need a matrix that simply has zeros, not NaNs, for missing bins.

It would be nice if c.matrix(balance=True) accepted a default fill value.
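Until such an option exists, the replacement can be done after fetching the matrix. The following is only a sketch: `fill_missing` is a hypothetical helper, not part of cooler, and the small array stands in for what `c.matrix(balance=True)[:]` would return when one bin is masked out.

```python
import numpy as np


def fill_missing(mat, fill_value=0.0):
    """Return a copy of `mat` with NaN entries replaced by `fill_value`.

    Hypothetical helper for illustration; not part of the cooler API.
    """
    mat = np.asarray(mat, dtype=float)
    return np.where(np.isnan(mat), fill_value, mat)


# Stand-in for a balanced matrix in which bin 1 is missing, so its
# whole row and column are NaN.
balanced = np.array([
    [0.5, np.nan, 0.2],
    [np.nan, np.nan, np.nan],
    [0.2, np.nan, 0.7],
])

dense = fill_missing(balanced)  # NaNs become 0.0, other values unchanged
```

Note that a zero fill makes missing bins indistinguishable from bins with no contacts, so keeping the original NaN mask around may still be useful.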

mimakaev, Jun 30 '17 19:06

Can NaNs raise error messages? I'm getting the following warnings and am not sure whether they are caused by missing bins or something else:

WARNING:py.warnings:/usr/local/lib/python3.6/dist-packages/numpy/core/fromnumeric.py:2920: RuntimeWarning: Mean of empty slice.

out=out, **kwargs)

WARNING:py.warnings:/usr/local/lib/python3.6/dist-packages/numpy/core/_methods.py:85: RuntimeWarning: invalid value encountered in double_scalars

ret = ret.dtype.type(ret / rcount)

WARNING:py.warnings:/usr/local/lib/python3.6/dist-packages/cooler/balance.py:355: RuntimeWarning: invalid value encountered in greater

logNzMarg = np.log(marg[marg>0])

WARNING:py.warnings:/usr/local/lib/python3.6/dist-packages/cooler/balance.py:359: RuntimeWarning: invalid value encountered in less

bias[marg < cutoff] = 0

rachadele, Oct 03 '19 21:10

Yes, NaNs can cause warnings like that. It looks like the marg array acquired at least one NaN value during the MAD-max filter step, where the marginals (row sums) are divided by their chromosomal median value.

This is probably from 0/0 operations coming from chromosomes with no reads (like chrY or chrM), so it is safe to ignore, assuming you are getting normal-looking output.
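A minimal sketch of how such a 0/0 could arise, assuming the marginals are divided by a per-chromosome median as described above. The chromosome names and values are made up, and this is not cooler's actual balancing code.

```python
import warnings

import numpy as np

# Hypothetical marginals (row sums) per chromosome; "chrY" has no reads.
marg_by_chrom = {
    "chr1": np.array([4.0, 6.0, 5.0]),
    "chrY": np.array([0.0, 0.0]),
}

scaled = {}
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # record every warning raised inside
    for chrom, marg in marg_by_chrom.items():
        # For chrY the median is 0, so this computes 0/0 and yields NaN,
        # and NumPy emits a RuntimeWarning.
        scaled[chrom] = marg / np.median(marg)

got_runtime_warning = any(
    issubclass(w.category, RuntimeWarning) for w in caught
)
```

Once NaNs are present in the scaled marginals, later steps that compare or aggregate them can produce further warnings of the kind shown in the log.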

It would be a good idea to avoid or suppress these warnings internally, though.
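One way to suppress them locally is NumPy's `np.errstate` context manager. This is a sketch of the general technique only; `scale_by_median` is a hypothetical function, not the fix used in cooler.

```python
import warnings

import numpy as np


def scale_by_median(marg):
    """Divide marginals by their median without emitting 0/0 warnings.

    Hypothetical helper for illustration; NaNs are still produced for
    an all-zero input, only the RuntimeWarning is silenced.
    """
    marg = np.asarray(marg, dtype=float)
    with np.errstate(divide="ignore", invalid="ignore"):
        return marg / np.median(marg)


# Turn warnings into errors to show that none escape the helper.
with warnings.catch_warnings():
    warnings.simplefilter("error")
    result = scale_by_median([0.0, 0.0])
```

Scoping the suppression to the single division keeps unrelated floating-point warnings elsewhere visible.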

nvictus, Oct 05 '19 02:10