Cut down memory requirements for same-split reshape where possible
Description
When reshaping distributed DNDarrays:
- if `new_split` is the same as the original split, and
- if the distribution (lshapes) allows it,

then reshape locally via PyTorch, stitch the `local_reshaped` tensors together along the split axis, and balance. This allows us to bypass the memory-intensive implementation of the distributed reshape in many cases.
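Conceptually, the fast path amounts to the sketch below. This is a minimal illustration under stated assumptions, not the actual implementation: `reshape_same_split_sketch` and `local_new_shape` are made-up names, and the real code first has to check that each process's chunk boundaries along the split axis are compatible with the target shape. In the benchmark below, for instance, each of the two processes holds a `(10, 500, 10)` chunk of the `(10, 1000, 10)` array, which maps cleanly onto a `(10, 5000)` chunk of the `(10, 10000)` result.

```python
import torch
import heat as ht

def reshape_same_split_sketch(x, local_new_shape, split):
    # x: DNDarray whose split axis is unchanged by the reshape.
    # local_new_shape: this process's share of the target shape,
    # derived from the global shape and the local chunk of the split axis.

    # Reshape the local chunk independently via PyTorch -- no
    # communication is needed because the split axis stays put.
    local_reshaped = torch.reshape(x.larray, local_new_shape)

    # Stitch the local tensors together along the split axis.
    out = ht.array(local_reshaped, is_split=split)

    # Rebalance so every process holds a roughly equal share.
    out.balance_()
    return out
```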
Example:
```python
import time
import tracemalloc

import torch
import heat as ht

tracemalloc.start()

t_x = torch.arange(100000).reshape(10, -1, 10)
x = ht.array(t_x, split=1)

current, peak = tracemalloc.get_traced_memory()
print(f"BEFORE RESHAPE: Current memory usage is {current / 10**6}MB; Peak was {peak / 10**6}MB")

# local torch.reshape as a baseline
start = time.perf_counter()
t_x = t_x.reshape(10, -1)
end = time.perf_counter()
current, peak = tracemalloc.get_traced_memory()
print(f"after torch.reshape: Current memory usage is {current / 10**6}MB; Peak was {peak / 10**6}MB")
print("torch.reshape takes ", (end - start), " seconds.")

# distributed ht.reshape with the same target shape (split stays 1)
start = time.perf_counter()
x = x.reshape(10, -1)
end = time.perf_counter()
current, peak = tracemalloc.get_traced_memory()
print(f"after ht.reshape: Current memory usage is {current / 10**6}MB; Peak was {peak / 10**6}MB")
print("ht.reshape takes ", (end - start), " seconds.")
```
Results on `master`, 2 processes:
```
[1,0]<stdout>:BEFORE RESHAPE: Current memory usage is 0.002501MB; Peak was 0.003077MB <---
[1,0]<stdout>:after torch.reshape: Current memory usage is 0.003669MB; Peak was 0.004101MB <---
[1,0]<stdout>:torch.reshape takes 2.068399999988202e-05 seconds.
[1,1]<stdout>:BEFORE RESHAPE: Current memory usage is 0.002501MB; Peak was 0.003105MB <---
[1,1]<stdout>:after torch.reshape: Current memory usage is 0.003669MB; Peak was 0.004101MB <---
[1,1]<stdout>:torch.reshape takes 2.2049000000023966e-05 seconds.
[1,1]<stdout>:after ht.reshape: Current memory usage is 0.372806MB; Peak was 0.383006MB <---
[1,1]<stdout>:ht.reshape takes 0.020710520999999815 seconds.
[1,0]<stdout>:after ht.reshape: Current memory usage is 0.372689MB; Peak was 0.382889MB <---
[1,0]<stdout>:ht.reshape takes 0.02076237900000022 seconds.
```
Results on `enhancement/distributed_reshape_same_split`, 2 processes:
```
[1,0]<stdout>:BEFORE RESHAPE: Current memory usage is 0.002501MB; Peak was 0.003077MB <---
[1,0]<stdout>:after torch.reshape: Current memory usage is 0.003669MB; Peak was 0.004101MB <---
[1,0]<stdout>:torch.reshape takes 1.6194000000080422e-05 seconds.
[1,1]<stdout>:BEFORE RESHAPE: Current memory usage is 0.002501MB; Peak was 0.003105MB <---
[1,1]<stdout>:after torch.reshape: Current memory usage is 0.003669MB; Peak was 0.004101MB <---
[1,1]<stdout>:torch.reshape takes 1.3567999999963831e-05 seconds.
[1,0]<stdout>:after ht.reshape: Current memory usage is 0.010736MB; Peak was 0.012752MB <---
[1,0]<stdout>:ht.reshape takes 0.015495102000000038 seconds.
[1,1]<stdout>:after ht.reshape: Current memory usage is 0.010736MB; Peak was 0.01278MB <---
[1,1]<stdout>:ht.reshape takes 0.01551089800000005 seconds.
```
Issue/s addressed: #874
Changes proposed:
- see above
Type of change
- New feature (non-breaking change which adds functionality)
Due Diligence
- [x] All split configurations tested
- [x] Multiple dtypes tested in relevant functions
- [x] Documentation updated (if needed)
- [x] Updated changelog.md under the title "Pending Additions"
Does this change modify the behaviour of other functions? If so, which?
No.
The failing tests may be resolved by #857; that PR would need to be merged to be certain.
Codecov Report
Merging #873 (0f4fd60) into master (293d873) will decrease coverage by 7.63%. The diff coverage is 60.00%.
```diff
@@            Coverage Diff             @@
##           master     #873      +/-   ##
==========================================
- Coverage   95.50%   87.87%   -7.64%
==========================================
  Files          64       64
  Lines        9579     9588       +9
==========================================
- Hits         9148     8425     -723
- Misses        431     1163     +732
```
| Flag | Coverage Δ | |
|---|---|---|
| gpu | 87.87% <60.00%> (-6.77%) | :arrow_down: |
| unit | ? | |

Flags with carried forward coverage won't be shown.
| Impacted Files | Coverage Δ | |
|---|---|---|
| heat/core/manipulations.py | 92.51% <60.00%> (-6.44%) | :arrow_down: |
| heat/optim/dp_optimizer.py | 13.59% <0.00%> (-82.49%) | :arrow_down: |
| heat/optim/utils.py | 38.15% <0.00%> (-61.85%) | :arrow_down: |
| heat/nn/data_parallel.py | 75.17% <0.00%> (-19.32%) | :arrow_down: |
| heat/spatial/distance.py | 80.90% <0.00%> (-15.08%) | :arrow_down: |
| heat/core/relational.py | 91.04% <0.00%> (-8.96%) | :arrow_down: |
| heat/core/linalg/qr.py | 91.25% <0.00%> (-8.75%) | :arrow_down: |
| heat/utils/data/partial_dataset.py | 87.17% <0.00%> (-7.18%) | :arrow_down: |
| heat/cluster/spectral.py | 88.57% <0.00%> (-5.72%) | :arrow_down: |
| ... and 12 more | | |
Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Superseded by #1125.