
Re-implementing rowsum/colsum for HDF5Matrix

Open PeteHaitch opened this issue 6 years ago • 3 comments

I just noticed that bsseq has been broken in devel for a while due to the removal of the rowsum,HDF5Matrix-method (which I okayed back in https://github.com/Bioconductor/DelayedArray/commit/1a6a44f0119d81ff5eae975a37b4eb4efe85efab#comments).

I now remember why I had this method: if the data are so large that they need to be stored in HDF5, then the result of rowsum()/colsum() may itself be too large to hold in memory. My method handled this by letting the result be written to an HDF5RealizationSink (it also gave me some extra control over the result, such as writing 2 separate rowsum()s to the same file as in https://github.com/PeteHaitch/DelayedMatrixStats/commit/27712d726a78350819961d4c10e50ef900118ef7).

Would it be possible to have a rowsum()/colsum() that supports realizing the result to a RealizationSink as it goes? I will try to find time to do this if you're okay with it.
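
For illustration only, here is a minimal base-R sketch of the "realize as it goes" idea: compute rowsum() on one block of columns at a time and hand each partial result to a writer. The names `blockwise_rowsum`, `block_ncol`, and `write_block_fun` are hypothetical and are not part of DelayedArray, and the in-memory matrix `sink_mat` is only a stand-in for an on-disk sink such as an HDF5RealizationSink.

```r
# Hypothetical helper (NOT a DelayedArray function): block-wise rowsum().
# `write_block_fun` is an assumed callback that writes one block of results
# to wherever the output lives (in practice, an on-disk sink).
blockwise_rowsum <- function(x, group, block_ncol, write_block_fun) {
    starts <- seq(1L, ncol(x), by = block_ncol)
    for (start in starts) {
        end <- min(start + block_ncol - 1L, ncol(x))
        cols <- start:end
        # Only this block of columns is held in memory at any one time.
        block <- as.matrix(x[, cols, drop = FALSE])
        # rowsum() sorts the groups, so every block has the same row order.
        block_res <- rowsum(block, group, reorder = TRUE)
        write_block_fun(cols, block_res)
    }
    invisible(NULL)
}

# Toy data and a toy in-memory "sink" standing in for an on-disk one.
x <- matrix(1:24, nrow = 6)
group <- c("a", "b", "a", "c", "b", "c")
sink_mat <- matrix(NA_real_,
                   nrow = length(unique(group)), ncol = ncol(x),
                   dimnames = list(sort(unique(group)), NULL))

blockwise_rowsum(x, group, block_ncol = 2L,
                 write_block_fun = function(cols, block_res) {
                     sink_mat[, cols] <<- block_res
                 })
```

Because each block covers all rows and a subset of columns, every partial result is complete for its columns and can be written once without being revisited. That property is what would let the full result stay out of memory.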

PeteHaitch avatar Feb 25 '19 05:02 PeteHaitch

I see. I didn't realize you had a use case for writing the result of rowsum() or colsum() to a file, sorry. I will implement this.

hpages avatar Mar 05 '19 17:03 hpages

Any chance of getting this change sooner rather than later so bsseq can pass on the devel builder before the release? It's been broken since 2/01/19.

lshep avatar Mar 28 '19 13:03 lshep

Hi @hpages, will you have time to do this before the BioC 3.9 deadline or should I backport the functionality to bsseq as an internal helper function?

PeteHaitch avatar Apr 17 '19 08:04 PeteHaitch