sprs icon indicating copy to clipboard operation
sprs copied to clipboard

Ergonomics could be improved

Open dylanede opened this issue 8 years ago • 6 comments

After trying out the library for a bit, I've found some places in the API that are a bit painful to work with at the moment. Number one is the (lack of) interop with dense matrices and vectors. It would be very nice to be able to construct a sparse array from a Vec<N>, for example, or dot product a sparse vector with a Vec<N>. I won't mention too much more about dense storage interop since you've got some issues opened about this.

One set of very useful things that is missing is support for per-element operations between matrices/vectors. For example if I want to multiply one vector by a constant and add it to another. I think the most practical way for the time being of supporting per-element operations would be to add some functions with differing numbers of arguments that take some borrowed matrices/vectors, and a closure to operate on them to produce the result, or maybe write into an existing matrix/vector. For example, something like

fn per_element1<N, F: FnMut(N) -> N>(a: CsVecView<N>, f: F) -> CsVecOwned<N>;
fn per_element2<N, F: FnMut(N, N) -> N>(a: CsVecView<N>, b: CsVecView<N>, f: F) -> CsVecOwned<N>;
fn per_element3<N, F: FnMut(N, N, N) -> N>(a: CsVecView<N>, b: CsVecView<N>, c: CsVecView<N>, f: F) -> CsVecOwned<N>;

(If this is to support dense vector/matrix interop, it may be worth considering a trait for vector views that dense vectors/matrices can implement as well as sparse ones) (Edit: There may need to be two sets of functions - ones for operation only on non-zero elements, and ones that operate on all elements, so that it is possible to have sparse usage of the function f)

Another more basic set of things that is missing is more operations on elements of matrices/vectors. E.g. retrieving and setting a value at a particular position just like a normal matrix. The implementation should handle the case of whether a value is zero or not and act appropriately, including insertion of indices into the storage vectors. To better support this kind of usage you may want to add support for mutable views, and think more carefully about the internal representation.

Speaking of views, it would be very nice to have views that are arbitrary sub-regions of the matrix/vector, as well as regions that extend past the boundaries of the original matrix/vector, returning zeroes for out-of-bounds locations.

Another thing that would need some careful thought would be support for reusing allocated storage and preallocating. This would would improve performance for cases where a particular routine must operate several times, and could eliminate repeated heap (de)allocations.

I'll add more thoughts to this issue if I think of anything else.

dylanede avatar Jan 12 '16 13:01 dylanede

After trying out the library for a bit, I've found some places in the API that are a bit painful to work with at the moment. Number one is the (lack of) interop with dense matrices and vectors. It would be very nice to be able to construct a sparse array from a Vec<N>, for example, or dot product a sparse vector with a Vec<N>. I won't mention too much more about dense storage interop since you've got some issues opened about this.

Indeed these things can be great. The tricky part of these is that ideally I'd like to be able (for vectors at least) have functions that can operate on either Vec as input, or a dense array (eg ndarray's arrays). There's probably going to be a trait for that.

One set of very useful things that is missing is support for per-element operations between matrices/vectors.

Yes indeed it's a must have.

Another more basic set of things that is missing is more operations on elements of matrices/vectors. E.g. retrieving and setting a value at a particular position just like a normal matrix. The implementation should handle the case of whether a value is zero or not and act appropriately, including insertion of indices into the storage vectors. To better support this kind of usage you may want to add support for mutable views, and think more carefully about the internal representation.

This one is tricky. I've recently added the triplet matrix, which enables this kind of operations with a lower runtime cost, but I'm not convinced this is a feature I want for the csc/csr matrices. It exists in eigen, but I've really found its usage quite clumsy. In my opinion once you have a compressed matrix you don't want to change its structure, only its values, which lets you benefit from symbolic factorizations. Being able to guarantee that at the type level seems nice.

The path I would prefer here would be to support that use case in another matrix format, probably akin to scipy's Dictionnary Of Keys, and have easy conversions between the formats.

Speaking of views, it would be very nice to have views that are arbitrary sub-regions of the matrix/vector, as well as regions that extend past the boundaries of the original matrix/vector, returning zeroes for out-of-bounds locations.

Yes that's a feature I want to have too. But it'll have to wait a bit, I want to get used to ndarray's indexing syntax beforehand, it could be great to use the same syntax for this feature.

Another thing that would need some careful thought would be support for reusing allocated storage and preallocating. This would would improve performance for cases where a particular routine must operate several times, and could eliminate repeated heap (de)allocations.

As it turns out that's one of my design goals. I try to always have allocation-free implementations, but I want to wrap those implementations into user-friendlier-but-allocating APIs later on (as in the cholesky module for instance). Is there any place in particular where you saw the need for this pattern?

One last thing, could you open one issue for each of these points? It would be easier for me to track it down.

Thanks again for your feedback.

vbarrielle avatar Jan 12 '16 19:01 vbarrielle

FYI I've started splitting this issue into multiple single issues to better track progress. I'll link the issues here.

vbarrielle avatar Feb 09 '16 18:02 vbarrielle

Another more basic set of things that is missing is more operations on elements of matrices/vectors. E.g. retrieving and setting a value at a particular position just like a normal matrix. The implementation should handle the case of whether a value is zero or not and act appropriately, including insertion of indices into the storage vectors. To better support this kind of usage you may want to add support for mutable views, and think more carefully about the internal representation.

Mutable views ticket: #56 Working on it right now in this branch: https://github.com/vbarrielle/sprs/tree/CsMatViewMut

Insertion ticket: #57

vbarrielle avatar Feb 09 '16 18:02 vbarrielle

Speaking of views, it would be very nice to have views that are arbitrary sub-regions of the matrix/vector, as well as regions that extend past the boundaries of the original matrix/vector, returning zeroes for out-of-bounds locations.

Related issue in #58.

vbarrielle avatar Feb 09 '16 18:02 vbarrielle

One set of very useful things that is missing is support for per-element operations between matrices/vectors. For example if I want to multiply one vector by a constant and add it to another. I think the most practical way for the time being of supporting per-element operations would be to add some functions with differing numbers of arguments that take some borrowed matrices/vectors, and a closure to operate on them to produce the result, or maybe write into an existing matrix/vector. For example, something like

Related issue #59.

vbarrielle avatar Feb 09 '16 18:02 vbarrielle

It would be very nice to be able to construct a sparse array from a Vec<N>, for example, or dot product a sparse vector with a Vec<N>.

See issue #60.

vbarrielle avatar Feb 09 '16 18:02 vbarrielle