sparseMatrixStats icon indicating copy to clipboard operation
sparseMatrixStats copied to clipboard

colProds and rowProds should accept method= argument

Open LTLA opened this issue 4 years ago • 8 comments

Even if it doesn't do anything with them. Just add a ... to both method definitions so that they don't throw when someone passes in method= in a generic.

LTLA avatar Sep 27 '20 02:09 LTLA

Same for type= for rowQuantiles and colQuantiles, though in this case, it seems like it would be best to respond to that choice; you might just copy the loop over rows/columns in matrixStats.

LTLA avatar Sep 27 '20 02:09 LTLA

Similarly, colRanks has preserve.shape= instead of preserveShape=.

Why not put these arguments in the generic so that they are guaranteed to be consistent across methods?

LTLA avatar Sep 27 '20 02:09 LTLA

I added the ... to [col|row]Prods() and also fixed the preserveShape argument name in colRanks.

Why not put these arguments in the generic so that they are guaranteed to be consistent across methods?

matrixStats supports some method arguments

  • dim. everywhere
  • method in rowProds
  • preserveShape in rowRanks
  • drop in rowQuantiles

that seemed like low-level optimization details specific for matrixStats.

I am aware that this can be surprising, because it means that sparseMatrixStats is not a perfect drop-in replacement.

If I had to rank: I am fairly certain that dim. is something specific for matrixStats and should not appear in the generic. I think the case for method is similar. I am less sure about preserveShape and the more I think about drop, I actually believe this should be in the generic.

Maybe we should rather discuss this in the MatrixGenerics repo, so that Pete can chime in as well :)

Same for type= for rowQuantiles and colQuantiles, though in this case, it seems like it would be best to respond to that choice; you might just copy the loop over rows/columns in matrixStats

I am undecided what to do about this one, because currently I am relying on my own quantile_sparse function. I could of course expand every column to a dense vector and call the quantile(type=...) function, however I would really like to avoid this. The alternative, is to actually understand how the type arguments differ and add that functionality to the quantile_sparse() function. Maybe for the mean time, I should just stop() if type != 7.

const-ae avatar Sep 28 '20 11:09 const-ae

Okay, I decided that I am fine for now with densifying each column if colQuantiles() is called with a non-default type argument.

I opened a PR at https://github.com/Bioconductor/MatrixGenerics/pull/14 to include drop and type in the signature of the generic method.

const-ae avatar Sep 28 '20 16:09 const-ae

I also realized why preserveShape cannot be in the generic signature of colRanks:

If the generic would look like this:

setGeneric("colRanks", function(x, rows = NULL, cols = NULL, ties.method = c("max", "average"), preserveShape = FALSE ...) standardGeneric("colRanks"),
           signature = "x"
)

The code fails because the .default_colRanks function here would be called with dim. = preserveShape.

If the preserveShape argument is moved behind the ..., I get a mismatch between the matrixStats and the MatrixGenerics method signature.

However, I could also be missing something here, so if there is a possibility to include preserveShape without including dim., I would be up to changing the method signature :)

const-ae avatar Sep 28 '20 17:09 const-ae

An expedient solution might be to swap the order of preserveShape and dim. in .default_colRanks.

LTLA avatar Sep 28 '20 22:09 LTLA

Good point, but I worry that this would be even more surprising.

const-ae avatar Sep 29 '20 08:09 const-ae

Seems like the choice is between that of developer-level surprise, where we have to make sure that the call inside .default_colRanks swaps the arguments; or user-level surprise, where people might start wondering where preserveShape= went and if it is supported in all methods for this generic. Given that preserveShape= actually affects the shape of the output, avoiding the latter appears to be more important.

LTLA avatar Sep 29 '20 15:09 LTLA