feat: implement sigmoid family gradient computations
## Summary
Implements mathematically correct gradient (backward-pass) computations for nine sigmoid-family activation functions in TensorOperations.cs, enabling proper backpropagation through these activations during neural network training.
## Changes
### Implemented Gradients
- Swish/SiLU: f'(x) = σ(x) + x * σ(x) * (1 - σ(x)) (see the sketch after this list)
- Mish: f'(x) = tanh(softplus(x)) + x * sech²(softplus(x)) * σ(x)
- Softplus: f'(x) = σ(x)
- SoftSign: f'(x) = 1 / (1 + |x|)²
- HardSigmoid: f'(x) = 0.2 if -2.5 < x < 2.5, else 0
- HardTanh: f'(x) = 1 if -1 < x < 1, else 0
- ScaledTanh: f'(x) = α * β * (1 - tanh²(β * x))
- BentIdentity: f'(x) = x / (2 * sqrt(x² + 1)) + 1
- Identity: already working (pass-through gradient)
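For illustration, here is a minimal sketch of one of these backward passes using the Swish/SiLU formula above. The scalar helper and `double[]` signature are hypothetical stand-ins, not the actual tensor API in TensorOperations.cs:

```csharp
using System;

// Minimal sketch, not the actual TensorOperations.cs API: applies the
// Swish/SiLU derivative element-wise and chains it with the upstream gradient.
static class SwishGradientSketch
{
    static double Sigmoid(double x) => 1.0 / (1.0 + Math.Exp(-x));

    // f(x) = x * σ(x)  =>  f'(x) = σ(x) + x * σ(x) * (1 - σ(x))
    public static double[] SwishBackward(double[] x, double[] upstream)
    {
        var grad = new double[x.Length];
        for (int i = 0; i < x.Length; i++)
        {
            double s = Sigmoid(x[i]);
            double local = s + x[i] * s * (1.0 - s); // local derivative f'(x)
            grad[i] = upstream[i] * local;           // chain rule
        }
        return grad;
    }
}
```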
## Technical Details
- All gradients follow the accumulation pattern via the `a.Gradient` property: null-check first, then `Add()` to accumulate, so the same node can be consumed multiple times (see the sketch below)
- Compatible with all target frameworks: net462, net471, netstandard2.0, net8.0
- No use of the null-forgiving operator (`!`)
- Build succeeds with 0 errors
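The accumulation pattern itself, as a hedged sketch; the `ComputationNode` and `Tensor` types below are illustrative stand-ins, not the real classes:

```csharp
using System;

// Hypothetical sketch of the null-check + Add() accumulation pattern;
// ComputationNode and Tensor are illustrative stand-ins, not the real classes.
class Tensor
{
    public double[] Data = Array.Empty<double>();

    public Tensor Add(Tensor other)
    {
        var sum = new Tensor { Data = new double[Data.Length] };
        for (int i = 0; i < Data.Length; i++)
            sum.Data[i] = Data[i] + other.Data[i];
        return sum;
    }
}

class ComputationNode
{
    public Tensor? Gradient; // stays null until the first backward contribution

    public void AccumulateGradient(Tensor localGradient)
    {
        // The first contribution is taken as-is; later contributions are summed,
        // so a node consumed by several operations receives all of its gradients.
        Gradient = Gradient == null ? localGradient : Gradient.Add(localGradient);
    }
}
```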
## Testing
- Build verified: 0 errors across all target frameworks
- Manual verification of gradient formulas against their mathematical definitions (see the finite-difference sketch below)
- Follows existing code patterns in TensorOperations.cs
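As an example of how the manual verification could be made repeatable, here is a hypothetical finite-difference spot check for the Swish gradient; it is not part of this PR:

```csharp
using System;

// Hypothetical finite-difference spot check for the Swish gradient;
// not part of the PR, just one way to automate the manual verification.
static class GradientCheckSketch
{
    static double Sigmoid(double x) => 1.0 / (1.0 + Math.Exp(-x));
    static double Swish(double x) => x * Sigmoid(x);

    static double SwishGrad(double x)
    {
        double s = Sigmoid(x);
        return s + x * s * (1.0 - s);
    }

    public static void Main()
    {
        const double eps = 1e-6;
        foreach (double x in new[] { -2.5, -1.0, 0.0, 0.5, 3.0 })
        {
            // Central difference approximates f'(x) with O(eps^2) error.
            double numeric = (Swish(x + eps) - Swish(x - eps)) / (2 * eps);
            double analytic = SwishGrad(x);
            Console.WriteLine($"x={x,4}: analytic={analytic:F8} numeric={numeric:F8} |diff|={Math.Abs(analytic - numeric):E2}");
        }
    }
}
```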
## Dependencies
- Depends on Agent 5's work (TensorOperations activation methods) ✅ Complete
- Blocks: Agent 9's interface architecture (SupportsJitCompilation property)
## Notes
The SupportsJitCompilation property update for activation files (SwishActivation.cs, MishActivation.cs, etc.) will be done by Agent 9 as part of the interface architecture changes. This PR focuses solely on the gradient implementations.
🤖 Generated with Claude Code