SparseArrays.jl icon indicating copy to clipboard operation
SparseArrays.jl copied to clipboard

`dropzeros!` for `SubArray`

Open btmit opened this issue 2 years ago • 5 comments

It would be awesome if dropzeros! were extended so the following pattern worked:


# create a sparse array
a = rand(4, 3)
as = sparse(a)

# loop over columns
for c in eachcol(as)
    # do something that creates some nonstructural zeros
    c[rand(1:4)] = 0
    # drop zeros per column subarray
    # dropzeros!(c)  # fails due to missing method
end

Obviously in this example I could call dropzeros! on the entire matrix at the end, but in other cases it may be more natural to nest it in a per-loop function.

btmit avatar Nov 07 '22 18:11 btmit

but is that a good idea? i think batching it is going to be much faster due to hitting GC/data shift.

SobhanMP avatar Nov 08 '22 01:11 SobhanMP

This is going to be really slow as Sobhan mentioned. You're going to be shifting every column higher than the one you're working on up or down each time.

This could work fine if we had pending insertions/deletions held in a COO format. SuiteSparse:GraphBLAS has that, and I have some version of it in testing elsewhere, but it would be too difficult to add it here.

rayegun avatar Nov 08 '22 02:11 rayegun

I'm not familiar with the implementation - just coming at it from the user side. To work around this I currently have to 1) copy sparse matrix to vector of sparse vectors 2) perform column-wise operations on sparse vectors including dropzeros!, then 3) copy back to a sparse matrix. Would it be slower than that?

btmit avatar Nov 08 '22 15:11 btmit

It's probably a bit better but what you are describing is not a good idea in general. The better idea is to do as much as computation as you can (maybe 100 columns?) and then call dropzero!. The problem is that any call to dropzeros will shift the internal structure of the array (if there are any zeros), that 's going to stay true even if we implement it for columns.

If you still think you have a valid use-case, would you mind giving a short example of what you're doing?

SobhanMP avatar Nov 09 '22 02:11 SobhanMP

Sure, I'll give it a shot. First, I want to say that I now understand why you think this is a bad idea and if the efficiency of dropzeros! was the driver of my application's run-time I would happily refactor it to pull dropzeros! to the top level.

More generally I also understand the hesitation to implement a method that is inherently "slow" because then you'll start seeing issues regarding that. I think that's a general challenge for Julia and maybe just languages in general. I've certainly been on the other end where I find a new tool only to learn that it is terribly slow and not useful to me (e.g. mapslices).

Short answer is that I can work around what exists in SparseArrays now. It's just a little awkward.

Here's some pseudocode for a better MWE:

# input: `mat` - some very large (possibly sparse) matrix

# modify mat inplace and return some result
function top!(mat::AbstractMatrix, params)
    # view of each or groups of multiple columns. this is applications specific
    blocks = chopmat(mat, params) 
    # foo! is some function that changes each block
    # in certain cases it may set additional elements to 0
    # to me it seems more natural to let foo! handle the call to dropzeros! down at the low level
    result = foo!.(blocks, Ref(params))
    # dropzeros!(mat)  # it's awkward to put dropzeros here because:
    # 1) mat may not be a sparse type
    # 2) it may not be needed for all foo! methods
    # 3) it's an implementation detail that you're bringing to the top level
    return result
end

# different methods for foo! based on if it's sparse, a vector, etc
foo!(x::Union{SparseVector, SubArray{T, 1, <:AbstractSparseArray}}, params) where T = ...
foo!(x...

chopmat(mat, ::Case1) = collect(eachcol(mat))

function chopmat(mat, params::Case2)
    # decide block size driven by application, not computational efficiency
    inds = bar(sizeof(mat), params)  
    return view.(Ref(mat), :, inds)
end

Thanks again for the attention. Happy to take suggestions at the top level as well.

btmit avatar Nov 09 '22 14:11 btmit