Da Zheng
Each matrix has a data ID that identifies the data in the matrix. When a virtual matrix is materialized, the materialized matrix should be given the same data ID as the virtual matrix...
groupby_row on a wide matrix might output a matrix with many rows. To improve performance, the wide matrix needs to be split and stored in a block matrix.
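The block idea can be sketched in base R. This is only an illustration under stated assumptions: `rowsum()` stands in for a group-by-row aggregation, and `block_size` and the column-wise split are hypothetical choices, not the library's actual block layout.

```R
set.seed(42)
wide <- matrix(runif(6 * 20), nrow = 6, ncol = 20)  # 6 rows, 20 columns
labels <- c("a", "b", "a", "c", "b", "a")            # one group label per row

block_size <- 8  # hypothetical block width (assumption)
starts <- seq(1, ncol(wide), by = block_size)

# Aggregate each column block independently, then stitch the results together.
blocks <- lapply(starts, function(s) {
  cols <- s:min(s + block_size - 1, ncol(wide))
  rowsum(wide[, cols, drop = FALSE], group = labels)
})
res_blocked <- do.call(cbind, blocks)

# Reference result computed on the whole matrix at once.
res_whole <- rowsum(wide, group = labels)
```

Because the aggregation works column by column, the blocked result should match the unblocked one, while each block stays small enough to process on its own.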
```R
> mat mat res res$d
 [1] 2.273649e+03 4.091728e+02 4.088768e+02 4.085853e+02 4.083932e+02
 [6] 4.081581e+02 4.080538e+02 4.077418e+02 4.072318e+02 4.070689e+02
[11] 1.106819e-05 7.270199e-06 6.023668e-06 2.217967e-06 1.690857e-06
[16] 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00...
```
The operations behave differently on matrices of different shapes and different data layouts.
To save I/O, we check the input during materialization. If a virtual matrix can't be materialized, we need to propagate the error to the user.
The current implementation assumes that a matrix is either tall or wide. However, we now need to handle square matrices. is_wide() is false for a square matrix and the transpose...
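One way to make the third case explicit is a three-way shape check. The sketch below is base R only; `mat_shape` is a hypothetical helper introduced for illustration and is not part of the library.

```R
# Hypothetical helper (assumption): classify a matrix as tall, wide, or square.
mat_shape <- function(m) {
  if (nrow(m) > ncol(m)) {
    "tall"
  } else if (nrow(m) < ncol(m)) {
    "wide"
  } else {
    "square"
  }
}

tall <- matrix(0, nrow = 10, ncol = 2)
sq   <- matrix(0, nrow = 4, ncol = 4)

shapes <- c(mat_shape(tall), mat_shape(t(tall)), mat_shape(sq), mat_shape(t(sq)))
```

Transposing flips tall and wide, but a square matrix keeps the same shape under transposition, so code that relies on the flip needs a separate branch for it.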
Test getting rows/cols. What happens when the index has negative values or some values are out of range? We also want to handle binary (logical) indices.
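As a reference point for such tests, base R's own indexing behavior on an ordinary matrix can be pinned down first. This sketch uses base R only; whether the library should match each case exactly is an assumption, not something stated above.

```R
m <- matrix(1:12, nrow = 3, ncol = 4)

# Negative indices drop the listed rows.
neg_ok <- identical(m[-1, , drop = FALSE], m[2:3, , drop = FALSE])

# An out-of-range column index raises an error on a matrix.
oob <- tryCatch(m[, 5], error = function(e) "error")

# Mixing positive and negative indices is also an error.
mixed <- tryCatch(m[c(-1, 2), ], error = function(e) "error")

# A logical (binary) index selects the positions that are TRUE.
keep <- c(TRUE, FALSE, TRUE, FALSE)
logical_ok <- identical(m[, keep], m[, c(1, 3)])
```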
Lazily evaluating computation on small matrices isn't necessary.
In diag(crossprod(mat)), we only need to compute the diagonal values.
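The saving is easy to see in base R: the diagonal of the Gram matrix is just the column sums of squares. This is a sketch of the identity on an ordinary R matrix, not the library's implementation.

```R
set.seed(1)
mat <- matrix(rnorm(1000 * 5), nrow = 1000, ncol = 5)

# Full Gram matrix: computes all 5 x 5 entries, then keeps only the diagonal.
full <- diag(crossprod(mat))

# Diagonal only: sum of squares of each column, with no off-diagonal work.
diag_only <- colSums(mat^2)
```

For an n x p matrix this replaces O(n * p^2) work with O(n * p) and avoids allocating the p x p intermediate.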
We don't need both fm.agg.mat and fm.agg.mat.lazy.