
Fix Linear layer bias gradient computation; add size checks to CUDA functions

Open hweom opened this issue 2 years ago • 2 comments

What does this PR accomplish?

  • 🩹 Bug Fix

Closes #169

Changes proposed by this PR:

  1. Add an assert to the CUDA copy() function to check that the source and destination have the same size.
  2. Add an assert to the CUDA gemm() function to check that the computed matrix multiplication dimensions match the passed tensor sizes.
  3. Fix the Linear layer bias gradient computation by summing the gradients over the batch instead of copying the output gradient to the bias gradient (which is incorrect, since the output gradient contains N items, not 1).
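
To illustrate item 3, here is a minimal sketch in plain Rust of summing the output gradient over the batch. It is not the code from this PR: the function name `bias_gradient`, the flat row-major `[batch, outputs]` layout, and the use of CPU slices instead of CUDA tensors are all assumptions made for the example.

```rust
/// Hypothetical helper (not part of juice): computes the bias gradient of a
/// linear layer from a row-major `[batch, outputs]` output gradient by
/// summing over the batch dimension.
fn bias_gradient(output_gradient: &[f32], batch: usize, outputs: usize) -> Vec<f32> {
    assert!(outputs > 0, "outputs must be non-zero");
    assert_eq!(
        output_gradient.len(),
        batch * outputs,
        "output gradient must hold batch * outputs elements"
    );

    // One accumulator per output unit; the bias has no batch dimension.
    let mut bias_grad = vec![0.0f32; outputs];

    // Each chunk is the gradient for one sample in the batch.
    for sample in output_gradient.chunks_exact(outputs) {
        for (acc, g) in bias_grad.iter_mut().zip(sample) {
            *acc += *g;
        }
    }
    bias_grad
}
```

With a batch of 2 samples and 3 outputs, the input holds 6 values while the result holds 3, which is why a plain copy from the output gradient to the bias gradient cannot be correct for a batch larger than 1.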

After this change, cuda-memcheck no longer finds errors.
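
The size checks from items 1 and 2 can be sketched as follows. This is an illustration under assumptions, not the PR's implementation: the names `check_copy_sizes`, `check_gemm_dims`, and `Transpose` are hypothetical, shapes are passed as plain `(rows, cols)` tuples, and mismatches are returned as errors here so they are easy to test, whereas the PR uses asserts.

```rust
/// Hypothetical transpose flag for a gemm operand.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Transpose {
    No,
    Yes,
}

/// Shape of an operand after applying its transpose flag.
fn effective_dims(shape: (usize, usize), t: Transpose) -> (usize, usize) {
    match t {
        Transpose::No => shape,
        Transpose::Yes => (shape.1, shape.0),
    }
}

/// Check for a copy: source and destination must hold the same element count.
fn check_copy_sizes(src_len: usize, dst_len: usize) -> Result<(), String> {
    if src_len == dst_len {
        Ok(())
    } else {
        Err(format!("copy size mismatch: source {src_len}, destination {dst_len}"))
    }
}

/// Check for C = op(A) * op(B): returns (m, n, k) if the shapes are consistent.
fn check_gemm_dims(
    a: (usize, usize),
    trans_a: Transpose,
    b: (usize, usize),
    trans_b: Transpose,
    c: (usize, usize),
) -> Result<(usize, usize, usize), String> {
    let (m, k) = effective_dims(a, trans_a);
    let (k_b, n) = effective_dims(b, trans_b);
    if k != k_b {
        return Err(format!("inner dimensions differ: {k} vs {k_b}"));
    }
    if c != (m, n) {
        return Err(format!("output shape {c:?} does not match ({m}, {n})"));
    }
    Ok((m, n, k))
}
```

Under these assumptions, copying an output gradient of `batch * outputs` elements into a bias gradient of `outputs` elements is rejected by `check_copy_sizes` whenever the batch is larger than 1, which is the kind of mismatch the new assert is meant to surface early.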

Notes to reviewer:

All unit tests continue to pass except ui, which is also broken at HEAD.

📜 Checklist

  • [x] The juice-examples run just fine

hweom • Jul 28 '22 00:07

Test fixes are in #172.

hweom • Aug 11 '22 02:08

Could you rebase this PR onto master? I'll merge right away then

drahnr • Aug 11 '22 07:08

Done.

hweom • Aug 14 '22 01:08