heat
Distributed Compressed Sparse Row Matrix
Description
Distributed Compressed Sparse Row Matrix: `Dcsr_matrix`
A format for the efficient storage and manipulation of sparse data (data that is mostly zeros). This distributed implementation builds on `torch.sparse_csr_tensor`, which serves as the process-local storage. Distribution is supported only along axis 0 (rows); splitting along other axes is omitted because it does not map well onto the CSR format. The API closely mimics `scipy.sparse`.
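As background (a plain-Python sketch of the CSR layout that `torch.sparse_csr_tensor` uses locally, not Heat's actual implementation), a matrix is stored as three flat arrays; a `split=0` distribution then hands each process a contiguous block of rows, i.e. a slice of `crow_indices` plus the matching `col_indices`/`values` segment:

```python
# Build the three CSR arrays from a small dense matrix.
dense = [
    [1, 0, 0],
    [0, 0, 2],
    [0, 3, 0],
]

crow_indices = [0]  # row i's entries live in values[crow_indices[i]:crow_indices[i + 1]]
col_indices = []    # column index of each stored value
values = []         # the non-zero entries, row by row

for row in dense:
    for j, v in enumerate(row):
        if v != 0:
            col_indices.append(j)
            values.append(v)
    crow_indices.append(len(values))

print(crow_indices, col_indices, values)
# [0, 1, 2, 3] [0, 2, 1] [1, 2, 3]
```

Because rows are self-contained in this layout, a row split only needs to re-base the local `crow_indices` to start at 0; a column split would cut through every row, which is why only axis 0 is supported.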
`ht.sparse.sparse_csr_matrix` is the sparse counterpart of the `ht.array` factory method. It takes either a `torch.sparse_csr_tensor` or a `scipy.sparse.csr_matrix` as input and generates a `Dcsr_matrix`.
Element-wise binary operations are working; currently only addition and multiplication are supported. In addition, only floating-point dtypes are supported in these operations, due to the use of `torch.sparse_csr_tensor`s.
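To illustrate what an element-wise operation on two CSR operands involves (a plain-Python sketch with a hypothetical `csr_add` helper, not Heat's actual code path), each row of the two matrices is merged by column index:

```python
def csr_add(crow_a, col_a, val_a, crow_b, col_b, val_b, nrows):
    """Element-wise sum of two CSR matrices with identical shape."""
    crow, col, val = [0], [], []
    for i in range(nrows):
        # Merge the i-th row of both operands, keyed by column index.
        row = {}
        for k in range(crow_a[i], crow_a[i + 1]):
            row[col_a[k]] = row.get(col_a[k], 0.0) + val_a[k]
        for k in range(crow_b[i], crow_b[i + 1]):
            row[col_b[k]] = row.get(col_b[k], 0.0) + val_b[k]
        for j in sorted(row):
            if row[j] != 0.0:  # drop explicit zeros produced by cancellation
                col.append(j)
                val.append(row[j])
        crow.append(len(val))
    return crow, col, val

# A = [[1, 0],      B = [[0,  3],
#      [0, 2]]           [0, -2]]
crow, col, val = csr_add([0, 1, 2], [0, 1], [1.0, 2.0],
                         [0, 1, 2], [1, 1], [3.0, -2.0], nrows=2)
print(crow, col, val)  # [0, 2, 2] [0, 1] [1.0, 3.0]
```

With a `split=0` distribution this merge can run independently on each process's block of rows, which is what makes element-wise operations a natural fit for the row-wise split.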
A `Dcsr_matrix` can be converted to the dense format (a `DNDarray`) using the `todense` method.
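Conceptually, densification scatters each stored value back into a zero-filled buffer. A minimal plain-Python sketch (a hypothetical `csr_to_dense` helper, not Heat's `todense` implementation):

```python
def csr_to_dense(crow, col, val, shape):
    """Expand CSR arrays into a nested-list dense matrix."""
    nrows, ncols = shape
    dense = [[0.0] * ncols for _ in range(nrows)]
    for i in range(nrows):
        for k in range(crow[i], crow[i + 1]):
            dense[i][col[k]] = val[k]
    return dense

print(csr_to_dense([0, 1, 2, 3], [0, 2, 1], [1.0, 2.0, 3.0], (3, 3)))
# [[1.0, 0.0, 0.0], [0.0, 0.0, 2.0], [0.0, 3.0, 0.0]]
```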
Further work:
- Exhaustive tests
- Other element-wise operations (bitwise and, or, etc.)
- Matrix multiplication
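For context on the planned matrix multiplication, the simplest building block is a CSR matrix–vector product, which traverses the same three arrays row by row (plain-Python sketch with a hypothetical `csr_matvec` helper, not part of this PR):

```python
def csr_matvec(crow, col, val, x):
    """y = A @ x for a CSR matrix A and a dense vector x."""
    return [
        # y[i] accumulates row i's stored entries times the matching x entries
        sum(val[k] * x[col[k]] for k in range(crow[i], crow[i + 1]))
        for i in range(len(crow) - 1)
    ]

# [[1, 0, 0], [0, 0, 2], [0, 3, 0]] @ [1, 1, 1]
print(csr_matvec([0, 1, 2, 3], [0, 2, 1], [1.0, 2.0, 3.0], [1.0, 1.0, 1.0]))
# [1.0, 2.0, 3.0]
```

With a row split, each process can compute its slice of `y` locally once the full `x` is available, which suggests the distributed version mainly needs a broadcast or gather of the right-hand side.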
Project Description: GSoC Project Idea - 2
Type of change
- New feature
Due Diligence
- [ ] All split configurations tested
- [ ] Multiple dtypes tested in relevant functions
- [ ] Documentation updated (if needed)
- [ ] Updated changelog.md under the title "Pending Additions"
Does this change modify the behaviour of other functions? If so, which?
yes / no
skip ci
Failed test on one process
```
=================================== FAILURES ===================================
_________________________ TestDcsr_matrix.test_larray __________________________

self = <heat.sparse.tests.test_dcsrmatrix.TestDcsr_matrix testMethod=test_larray>

    def test_larray(self):
        heat_sparse_csr = ht.sparse.sparse_csr_matrix(self.ref_torch_sparse_csr)
        self.assertIsInstance(heat_sparse_csr.larray, torch.Tensor)
        self.assertEqual(heat_sparse_csr.larray.layout, torch.sparse_csr)
        self.assertEqual(heat_sparse_csr.larray.shape, heat_sparse_csr.lshape)
        self.assertEqual(heat_sparse_csr.larray.shape, heat_sparse_csr.gshape)

        # Distributed case
        heat_sparse_csr = ht.sparse.sparse_csr_matrix(self.ref_torch_sparse_csr, split=0)
        self.assertIsInstance(heat_sparse_csr.larray, torch.Tensor)
        self.assertEqual(heat_sparse_csr.larray.layout, torch.sparse_csr)
        self.assertEqual(heat_sparse_csr.larray.shape, heat_sparse_csr.lshape)
>       self.assertNotEqual(heat_sparse_csr.larray.shape, heat_sparse_csr.gshape)
E       AssertionError: torch.Size([5, 5]) == (5, 5)

heat/sparse/tests/test_dcsrmatrix.py:44: AssertionError
```
Codecov Report
Merging #1028 (6d45bd4) into main (9cce973) will increase coverage by 0.00%. The diff coverage is 91.76%.
```
@@           Coverage Diff            @@
##            main    #1028     +/-  ##
========================================
  Coverage  91.75%   91.76%
========================================
  Files         65       72       +7
  Lines      10024    10352     +328
========================================
+ Hits        9198     9499     +301
- Misses       826      853      +27
```
| Flag | Coverage Δ |
|---|---|
| unit | 91.76% <91.76%> (+<0.01%) ⬆️ |

Flags with carried forward coverage won't be shown.
| Impacted Files | Coverage Δ |
|---|---|
| heat/core/_operations.py | 96.04% <ø> (ø) |
| heat/sparse/_operations.py | 76.56% <76.56%> (ø) |
| heat/sparse/dcsr_matrix.py | 93.93% <93.93%> (ø) |
| heat/sparse/factories.py | 95.29% <95.29%> (ø) |
| heat/__init__.py | 100.00% <100.00%> (ø) |
| heat/core/communication.py | 96.21% <100.00%> (+0.01%) ⬆️ |
| heat/sparse/__init__.py | 100.00% <100.00%> (ø) |
| heat/sparse/arithmetics.py | 100.00% <100.00%> (ø) |
| heat/sparse/manipulations.py | 100.00% <100.00%> (ø) |
| heat/sparse/tests/__init__.py | 100.00% <100.00%> (ø) |
@ClaudiaComito I have made all the changes you requested. Thanks for your review! And yes, a `to_sparse` method would be amazing. I will work on it in a separate PR; hoping that's alright.