pytorch_scatter
[DO NOT REVIEW] Testing integration
[wip] Proof of concept that torch ops in nightly will pass tests + identify gaps in what's in core
Gaps with no workarounds (fixes are wip):
- [x] integer support
- [ ] ~~missing arg_* variants (argmin, argmax) in core; these are not exposed by the torch_scatter documentation~~
- [ ] no support for segment_coo yet (which does not seem to be used by pytorch_scatter/pytorch_geometric)
Gaps that have workarounds:
- [ ] when using torch.scatter/torch.segment_reduce, in-place operations in composite ops break autograd (see softmax for example), so we need to do many wasteful copies; we could support logsumexp/softmax as options in the kernel to avoid this (see the first sketch after this list)
- [ ] empty segments are filled with reduction inits in core (current workaround is using torch.nan_to_num; see the second sketch after this list)
  - this is only a good workaround if the data itself contains no nans/infs/-infs, or if overwriting any that do exist in the data is acceptable
- [ ] [easy] remove data.numel() > 0 check in core for segment_reduce (will be part of porting SegmentReduce to TensorIterator)
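As a rough illustration of the copies mentioned above, here is a minimal sketch (not code from this PR) of a composite scatter softmax written only with out-of-place ops on top of torch.scatter_reduce; the helper name `scatter_softmax_1d`, the 1-D shapes, and the identity-initialised output are assumptions for the example, and it assumes a build where torch.scatter_reduce is available.

```python
import torch

# Minimal out-of-place sketch (not this PR's code) of a composite scatter softmax
# built on torch.scatter_reduce. Every intermediate is a fresh tensor, so autograd
# never sees an in-place mutation -- these temporaries are the "wasteful copies"
# referred to above; a fused softmax/logsumexp option in the kernel would avoid them.
def scatter_softmax_1d(src: torch.Tensor, index: torch.Tensor, dim_size: int) -> torch.Tensor:
    # per-segment max for numerical stability; detached because softmax is
    # invariant to a constant shift per segment, so no gradient is needed here
    init = torch.full((dim_size,), float("-inf"), dtype=src.dtype)
    seg_max = init.scatter_reduce(0, index, src, reduce="amax", include_self=False).detach()
    centered = src - seg_max.gather(0, index)                     # copy 1
    exp = centered.exp()                                          # copy 2
    denom = torch.zeros(dim_size, dtype=src.dtype).scatter_add(0, index, exp)
    return exp / denom.gather(0, index)                           # copy 3

src = torch.tensor([0.5, 1.5, 2.0, -1.0], requires_grad=True)
index = torch.tensor([0, 0, 1, 1])
scatter_softmax_1d(src, index, dim_size=2).sum().backward()      # autograd runs, no in-place errors
```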
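And a minimal sketch of the torch.nan_to_num workaround for empty segments; the tensors are made up, and the output is initialised to the amax identity (-inf) here to mimic the "empty segments are filled with reduction inits" behaviour described above.

```python
import torch

# Illustration (made-up tensors) of the nan_to_num workaround: an output
# initialised to the amax identity (-inf) keeps that identity wherever a
# segment receives no elements, and nan_to_num maps those slots back to the
# 0 that torch_scatter produces for empty segments. This is only safe when
# the real data contains no nan/inf values of its own.
src = torch.tensor([1.0, 3.0, 2.0])
index = torch.tensor([0, 0, 2])                      # segments 1 and 3 get no elements
out = torch.full((4,), float("-inf")).scatter_reduce(
    0, index, src, reduce="amax", include_self=False)
print(out)                                           # tensor([3., -inf, 2., -inf])
out = torch.nan_to_num(out, nan=0.0, posinf=0.0, neginf=0.0)
print(out)                                           # tensor([3., 0., 2., 0.])
```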
Next steps:
- [x] run benchmarking script before and after the PR: rendered benchmarking results [cuda after] [cuda before from 303] [cuda speedups]
- [x] ~~Currently blocked on this as running the benchmarking script before the PR is leading to an illegal CUDA memory access (for options other than mean and sum which call into core ops on master)~~ #303
Findings:
- Our backward implementation in core is slow; this might be because the gradient formulas in core are a bit different from those here (see the first sketch after this list):
  - we evenly distribute the gradient, as in `torch.min`/`torch.max`, when multiple elements correspond to the max
  - we handle the case for prod where an element is 0 and `gradient * result / out` will not accurately propagate the gradient for 0s
- [ ] more investigation is needed into which parts of the backward function are the bottlenecks; writing a standalone backward kernel is a potential next step that could speed this up
- the `include_self` flag in core is slow; this makes sense because when `include_self=False` there are 2 scatters being done (see the second sketch after this list)
  - [ ] This needs to be wrapped into the kernel in core
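A minimal sketch of the tie-handling difference in the first finding, assuming a nightly with autograd support for torch.scatter_reduce; the expected gradient follows from the even-distribution behaviour described above, and the tensors are made up for illustration.

```python
import torch

# Two src elements tie for the segment max. Core's backward (as described
# above) splits the incoming gradient evenly across the tied elements,
# whereas torch_scatter.scatter_max routes the whole gradient to the single
# recorded argmax. Expected here: src.grad == tensor([0.5, 0.5, 0.0]).
src = torch.tensor([2.0, 2.0, 1.0], requires_grad=True)
index = torch.tensor([0, 0, 0])
out = torch.zeros(1).scatter_reduce(0, index, src, reduce="amax", include_self=False)
out.sum().backward()
print(src.grad)
```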
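And a sketch of why `include_self=False` currently costs roughly two scatter passes; this emulates the observable behaviour with two explicit calls rather than showing the actual core kernel, and the tensors are made up.

```python
import torch

# Emulating include_self=False with two passes: first reset the indexed slots
# to the reduction identity, then reduce with include_self=True. Untouched
# slots keep their original values in both versions.
src = torch.tensor([1.0, 4.0, 2.0])
index = torch.tensor([0, 0, 1])
self_t = torch.tensor([10.0, 10.0, 10.0])

# one fused call
fused = self_t.scatter_reduce(0, index, src, reduce="amax", include_self=False)

# equivalent two-pass version
identity = torch.full_like(src, float("-inf"))                     # identity for amax
two_pass = self_t.scatter(0, index, identity)                      # pass 1: clear indexed slots
two_pass = two_pass.scatter_reduce(0, index, src, reduce="amax")   # pass 2: reduce
assert torch.equal(fused, two_pass)                                # both: tensor([4., 2., 10.])
```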
Codecov Report
Merging #301 (4bd199f) into master (847995d) will increase coverage by 0.43%. The diff coverage is 98.73%.
```diff
@@            Coverage Diff             @@
##           master     #301      +/-   ##
==========================================
+ Coverage   97.54%   97.98%   +0.43%
==========================================
  Files           9        9
  Lines         204      248      +44
==========================================
+ Hits          199      243      +44
  Misses          5        5
```
| Impacted Files | Coverage Δ | |
|---|---|---|
| torch_scatter/utils.py | 95.65% <91.66%> (-4.35%) | :arrow_down: |
| torch_scatter/composite/logsumexp.py | 90.90% <100.00%> (ø) | |
| torch_scatter/composite/softmax.py | 100.00% <100.00%> (ø) | |
| torch_scatter/scatter.py | 100.00% <100.00%> (+2.00%) | :arrow_up: |
| torch_scatter/segment_csr.py | 100.00% <100.00%> (ø) | |
This pull request had no activity for 6 months. It will be closed in 2 weeks unless there is some new activity.