Description

Distributed Compressed Sparse Row Matrix: Dcsr_matrix

A format for the efficient storage and manipulation of sparse data, i.e. data whose entries are mostly zeros. This distributed implementation builds on torch.sparse_csr_tensor, which serves as the process-local storage. It supports distribution along axis 0 (rows) only; other axes are omitted because they do not work well with this format. The API closely mimics that of the scipy sparse library.
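
To make the layout concrete, here is a minimal pure-Python sketch of CSR storage and of a row-wise split. It is an illustration only, not the code in this PR: the helper names `dense_to_csr` and `split_rows` are hypothetical, and the contiguous row-block partitioning is an assumption.

```python
def dense_to_csr(dense):
    """Convert a list-of-lists matrix to CSR arrays (indptr, indices, data)."""
    indptr, indices, data = [0], [], []
    for row in dense:
        for col, value in enumerate(row):
            if value != 0:
                indices.append(col)   # column of each stored entry
                data.append(value)    # value of each stored entry
        indptr.append(len(data))      # running count of stored entries per row
    return indptr, indices, data


def split_rows(indptr, indices, data, row_start, row_stop):
    """Extract rows [row_start, row_stop) as an independent CSR block."""
    lo, hi = indptr[row_start], indptr[row_stop]
    # Rebase the row pointers so the local block starts at offset 0.
    local_indptr = [p - lo for p in indptr[row_start:row_stop + 1]]
    return local_indptr, indices[lo:hi], data[lo:hi]


dense = [
    [1.0, 0.0, 2.0],
    [0.0, 0.0, 3.0],
    [4.0, 0.0, 0.0],
    [0.0, 5.0, 0.0],
]
indptr, indices, data = dense_to_csr(dense)

# Two hypothetical processes, each owning two consecutive rows.
top = split_rows(indptr, indices, data, 0, 2)
bottom = split_rows(indptr, indices, data, 2, 4)
```

A block of consecutive rows is a contiguous slice of `indices` and `data`, so a split along axis 0 only needs slicing plus a rebased `indptr`. A split along columns would instead have to filter every row, which is one way to see why other axes fit this format poorly.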

ht.sparse.sparse_csr_matrix is the sparse counterpart of ht.array. It takes either a torch.sparse_csr_tensor or a scipy.sparse.csr_matrix as input and produces a Dcsr_matrix.

Element-wise operations are implemented through a working binary operator. Currently only addition and multiplication are supported, and only for the float datatype, because the operations rely on torch.sparse_csr_tensor.
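
As a rough illustration of what element-wise addition and multiplication mean for CSR data, here is a pure-Python sketch. It is not the implementation in this PR (which works on torch.sparse_csr_tensor); the names `csr_rows` and `csr_binary_op` are hypothetical, and each operand is assumed to be an `(indptr, indices, data)` triple of the same shape.

```python
def csr_rows(indptr, indices, data):
    """Yield each row of a CSR triple as a {column: value} dict."""
    for r in range(len(indptr) - 1):
        lo, hi = indptr[r], indptr[r + 1]
        yield dict(zip(indices[lo:hi], data[lo:hi]))


def csr_binary_op(a, b, op):
    """Apply `op` ("add" or "mul") element-wise to two CSR triples."""
    indptr, indices, data = [0], [], []
    for row_a, row_b in zip(csr_rows(*a), csr_rows(*b)):
        if op == "add":
            # Addition keeps the union of both sparsity patterns.
            cols = sorted(row_a.keys() | row_b.keys())
            values = [row_a.get(c, 0.0) + row_b.get(c, 0.0) for c in cols]
        elif op == "mul":
            # Multiplication keeps only entries stored in both operands.
            cols = sorted(row_a.keys() & row_b.keys())
            values = [row_a[c] * row_b[c] for c in cols]
        else:
            raise ValueError(f"unsupported op: {op}")
        indices.extend(cols)
        data.extend(values)
        indptr.append(len(data))
    return indptr, indices, data


# Two 2x3 matrices in CSR form.
a = ([0, 2, 3], [0, 2, 1], [1.0, 2.0, 3.0])      # [[1, 0, 2], [0, 3, 0]]
b = ([0, 1, 3], [2, 0, 1], [10.0, 20.0, 30.0])   # [[0, 0, 10], [20, 30, 0]]

total = csr_binary_op(a, b, "add")
product = csr_binary_op(a, b, "mul")
```

Both operations work row by row, so if two operands share the same row partitioning, each process could in principle apply them to its local block without communication.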

A Dcsr_matrix can be converted to the dense format (a DNDarray) using the todense method.
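
Conceptually, densifying a CSR block means scattering each stored value into a zero-filled row. The sketch below shows that idea in pure Python; `csr_to_dense` is a hypothetical helper used for illustration, not the todense method itself.

```python
def csr_to_dense(indptr, indices, data, n_cols):
    """Expand a CSR triple into a dense list-of-lists matrix."""
    dense = []
    for r in range(len(indptr) - 1):
        row = [0.0] * n_cols
        # Entries of row r live in positions indptr[r] .. indptr[r + 1] - 1.
        for k in range(indptr[r], indptr[r + 1]):
            row[indices[k]] = data[k]
        dense.append(row)
    return dense


dense = csr_to_dense([0, 2, 3], [0, 2, 1], [1.0, 2.0, 3.0], n_cols=3)
```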

Further work:

Exhaustive tests
Other element-wise operations such as bitwise and, or, etc.
Matrix multiplication

Project Description: GSoC Project Idea - 2

Type of change

New feature

Due Diligence

[ ] All split configurations tested
[ ] Multiple dtypes tested in relevant functions
[ ] Documentation updated (if needed)
[ ] Updated changelog.md under the title "Pending Additions"

Does this change modify the behaviour of other functions? If so, which?

yes / no

skip ci

Sep 17 '22 15:09 Mystic-Slice

Sep 17 '22 15:09 ghost

Failed test on one process

=================================== FAILURES ===================================
_________________________ TestDcsr_matrix.test_larray __________________________
self = <heat.sparse.tests.test_dcsrmatrix.TestDcsr_matrix testMethod=test_larray>
    def test_larray(self):
        heat_sparse_csr = ht.sparse.sparse_csr_matrix(self.ref_torch_sparse_csr)
    
        self.assertIsInstance(heat_sparse_csr.larray, torch.Tensor)
        self.assertEqual(heat_sparse_csr.larray.layout, torch.sparse_csr)
        self.assertEqual(heat_sparse_csr.larray.shape, heat_sparse_csr.lshape)
        self.assertEqual(heat_sparse_csr.larray.shape, heat_sparse_csr.gshape)
    
        # Distributed case
        heat_sparse_csr = ht.sparse.sparse_csr_matrix(self.ref_torch_sparse_csr, split=0)
    
        self.assertIsInstance(heat_sparse_csr.larray, torch.Tensor)
        self.assertEqual(heat_sparse_csr.larray.layout, torch.sparse_csr)
        self.assertEqual(heat_sparse_csr.larray.shape, heat_sparse_csr.lshape)
>       self.assertNotEqual(heat_sparse_csr.larray.shape, heat_sparse_csr.gshape)
E       AssertionError: torch.Size([5, 5]) == (5, 5)
heat/sparse/tests/test_dcsrmatrix.py:44: AssertionError
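
A likely reading of this failure: with a single process, the only rank owns every row, so the local shape equals the global shape even when split=0 is requested, and torch.Size compares equal to a plain tuple because it is a tuple subclass. The sketch below illustrates this with a hypothetical row-distribution helper; the scheme (contiguous blocks, with the first ranks taking one extra row each) is an assumption made for illustration.

```python
def local_row_count(n_rows, n_procs, rank):
    """Rows owned by `rank` when rows are dealt out in contiguous blocks."""
    base, extra = divmod(n_rows, n_procs)
    # Assumed scheme: the first `extra` ranks each take one additional row.
    return base + (1 if rank < extra else 0)


def local_shape(gshape, n_procs, rank):
    """Local shape of a matrix with global shape `gshape` split along axis 0."""
    return (local_row_count(gshape[0], n_procs, rank),) + tuple(gshape[1:])


gshape = (5, 5)
single = local_shape(gshape, n_procs=1, rank=0)
two = [local_shape(gshape, n_procs=2, rank=r) for r in range(2)]
```

Here `single` matches `gshape`, while with two processes no rank holds the full matrix. One possible fix is to run the assertNotEqual check only when more than one process is in use.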

Sep 23 '22 07:09 mtar

Codecov Report

Merging #1028 (6d45bd4) into main (9cce973) will increase coverage by 0.00%. The diff coverage is 91.76%.

@@           Coverage Diff            @@
##             main    #1028    +/-   ##
========================================
  Coverage   91.75%   91.76%            
========================================
  Files          65       72     +7     
  Lines       10024    10352   +328     
========================================
+ Hits         9198     9499   +301     
- Misses        826      853    +27

Flag	Coverage Δ
unit	`91.76% <91.76%> (+<0.01%)`	:arrow_up:

Flags with carried forward coverage won't be shown.

Impacted Files	Coverage Δ
heat/core/_operations.py	`96.04% <ø> (ø)`
heat/sparse/_operations.py	`76.56% <76.56%> (ø)`
heat/sparse/dcsr_matrix.py	`93.93% <93.93%> (ø)`
heat/sparse/factories.py	`95.29% <95.29%> (ø)`
heat/__init__.py	`100.00% <100.00%> (ø)`
heat/core/communication.py	`96.21% <100.00%> (+0.01%)`	:arrow_up:
heat/sparse/__init__.py	`100.00% <100.00%> (ø)`
heat/sparse/arithmetics.py	`100.00% <100.00%> (ø)`
heat/sparse/manipulations.py	`100.00% <100.00%> (ø)`
heat/sparse/tests/__init__.py	`100.00% <100.00%> (ø)`


Sep 23 '22 08:09 codecov[bot]

@ClaudiaComito I have made all the changes you requested. Thanks for your review! And yes, a to_sparse method would be amazing. I will work on it in a separate PR; I hope that's alright.

Nov 15 '22 17:11 Mystic-Slice
