cooler
default value instead of np.nans for cooler matrix
Many algorithms need a matrix that simply has zeros in missing bins, not NaNs.
It would be nice if c.matrix(balance=True) accepted a default fill value for those bins.
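Until something like that exists, the NaNs can be filled after the fact. The snippet below is only a sketch of that workaround: `fill_missing` is a hypothetical helper, not part of cooler, and the small `balanced` array is a stand-in for whatever `c.matrix(balance=True)` returns for a region.

```python
import numpy as np

def fill_missing(mat, fill_value=0.0):
    """Return a copy of `mat` with NaN entries replaced by `fill_value`.

    Hypothetical helper for illustration; not a cooler function.
    """
    mat = np.asarray(mat, dtype=float)
    return np.where(np.isnan(mat), fill_value, mat)

# Stand-in for a balanced matrix in which the middle bin was filtered out,
# so its whole row and column are NaN.
balanced = np.array([
    [0.5,    np.nan, 0.2],
    [np.nan, np.nan, np.nan],
    [0.2,    np.nan, 0.7],
])

filled = fill_missing(balanced)
```

Note that filling with zeros makes masked bins indistinguishable from bins with genuinely zero contact frequency, so it may be worth keeping the NaN mask (`np.isnan(balanced)`) around if the downstream algorithm can use it.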
Can NaNs raise error messages? I'm getting the following warnings and am not sure whether they are caused by missing bins or something else:
WARNING:py.warnings:/usr/local/lib/python3.6/dist-packages/numpy/core/fromnumeric.py:2920: RuntimeWarning: Mean of empty slice.
out=out, **kwargs)
WARNING:py.warnings:/usr/local/lib/python3.6/dist-packages/numpy/core/_methods.py:85: RuntimeWarning: invalid value encountered in double_scalars
ret = ret.dtype.type(ret / rcount)
WARNING:py.warnings:/usr/local/lib/python3.6/dist-packages/cooler/balance.py:355: RuntimeWarning: invalid value encountered in greater
logNzMarg = np.log(marg[marg>0])
WARNING:py.warnings:/usr/local/lib/python3.6/dist-packages/cooler/balance.py:359: RuntimeWarning: invalid value encountered in less
bias[marg < cutoff] = 0
Yes, NaNs can cause warnings like that. It looks like the marg array acquired at least one NaN value during the MAD-max filter step, where the marginals (row sums) are divided by their chromosomal median value.
This is probably from 0/0 operations on chromosomes with no reads (like chrY or chrM), so it is safe to ignore, assuming you are getting normal-looking output.
It would be a good idea to avoid or suppress these warnings internally, though.
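As a rough illustration of how the NaNs arise and how the warnings could be silenced, here is a simplified sketch. It is not the actual code in cooler/balance.py: the `marg_by_chrom` data and the `scale_by_median` helper are made up for the example, and the real filter step may compute the median differently.

```python
import numpy as np

# Hypothetical per-bin marginals; chrY stands in for a chromosome with no reads.
marg_by_chrom = {
    "chr1": np.array([10.0, 12.0, 8.0]),
    "chrY": np.array([0.0, 0.0, 0.0]),
}

def scale_by_median(marg):
    """Divide marginals by their median, silencing 0/0 warnings locally."""
    med = np.median(marg)
    # For an empty chromosome med is 0, so marg / med is 0/0 = NaN.
    # np.errstate suppresses the RuntimeWarning only inside this block.
    with np.errstate(divide="ignore", invalid="ignore"):
        return marg / med

scaled = {name: scale_by_median(m) for name, m in marg_by_chrom.items()}

# Downstream comparisons can skip the NaNs explicitly instead of
# relying on how NaN compares against a threshold.
all_scaled = np.concatenate(list(scaled.values()))
usable = all_scaled[np.isfinite(all_scaled) & (all_scaled > 0)]
```

Scoping the suppression with `np.errstate` keeps floating-point warnings enabled everywhere else, which is usually preferable to turning them off globally.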