
feat: implement sigmoid family gradient computations

Open ooples opened this issue 1 month ago • 1 comment

Summary

Implements mathematically correct gradient (backward pass) computations for nine sigmoid-family activation functions in TensorOperations.cs. This enables proper backpropagation through these activations during neural network training.

Changes

Implemented Gradients

  1. Swish/SiLU: f'(x) = σ(x) + x * σ(x) * (1 - σ(x)) (a sketch of this backward pass follows the list)
  2. Mish: f'(x) = tanh(softplus(x)) + x * sech²(softplus(x)) * σ(x)
  3. Softplus: f'(x) = σ(x)
  4. SoftSign: f'(x) = 1 / (1 + |x|)²
  5. HardSigmoid: f'(x) = 0.2 if -2.5 < x < 2.5, else 0
  6. HardTanh: f'(x) = 1 if -1 < x < 1, else 0
  7. ScaledTanh: f'(x) = α * β * (1 - tanh²(β * x))
  8. BentIdentity: f'(x) = x / (2 * sqrt(x² + 1)) + 1
  9. Identity: Already working (pass-through gradient)
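
For illustration, here is how the Swish backward pass could look under the accumulation pattern described below. This is a minimal sketch: the ComputationNode and Tensor types and the Map, ElementwiseMultiply, and Add method names are assumptions made for the example, not the actual AiDotNet API; the real implementation lives in TensorOperations.cs.

```csharp
// Illustrative sketch only: ComputationNode, Tensor, Map, ElementwiseMultiply
// and Add are assumed names, not the actual AiDotNet API surface.
public static ComputationNode Swish(ComputationNode a)
{
    // Forward pass: f(x) = x * sigma(x)
    var sigmoid = a.Value.Map(x => 1.0 / (1.0 + Math.Exp(-x)));
    var result = a.Value.ElementwiseMultiply(sigmoid);

    var node = new ComputationNode(result, parents: new[] { a });
    node.Backward = outputGradient =>
    {
        // Local derivative: f'(x) = sigma(x) + x * sigma(x) * (1 - sigma(x))
        var localGrad = a.Value.Map(x =>
        {
            var s = 1.0 / (1.0 + Math.Exp(-x));
            return s + x * s * (1.0 - s);
        });

        // Chain rule: incoming gradient times the local derivative.
        var grad = outputGradient.ElementwiseMultiply(localGrad);

        // Accumulate rather than overwrite, so a node consumed by several
        // operations receives the sum of all incoming gradients.
        a.Gradient = a.Gradient == null ? grad : a.Gradient.Add(grad);
    };
    return node;
}
```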

Technical Details

  • All gradients follow the proper accumulation pattern via the a.Gradient property
  • Gradients are accumulated with a null check plus Add(), so a node used multiple times collects contributions from every consumer (see the sketch after this list)
  • Compatible with all target frameworks: net462, net471, netstandard2.0, net8.0
  • No use of null-forgiving operator (!)
  • Build succeeds with 0 errors
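
As a minimal sketch of the null-check-plus-Add() pattern shared by all nine gradients (the AccumulateGradient helper and the ComputationNode/Tensor names are hypothetical, chosen only to illustrate the idea):

```csharp
// Hypothetical helper capturing the accumulation pattern described above;
// the type and method names are illustrative, not AiDotNet's.
private static void AccumulateGradient(ComputationNode a, Tensor grad)
{
    // The first contribution initializes the gradient; later contributions,
    // e.g. when the same node feeds several downstream operations, are summed.
    a.Gradient = a.Gradient == null ? grad : a.Gradient.Add(grad);
}
```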

Testing

  • Build verified: 0 errors across all target frameworks
  • Manual verification of gradient formulas against their mathematical definitions (a finite-difference check sketch follows this list)
  • Follows existing code patterns in TensorOperations.cs
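
One way the manual formula verification could be automated is a quick finite-difference comparison. The snippet below is a standalone console sketch using Swish as the example; the helpers are local to the sketch and not part of AiDotNet.

```csharp
using System;

// Standalone finite-difference check of the analytic Swish derivative.
static double Sigmoid(double x) => 1.0 / (1.0 + Math.Exp(-x));
static double Swish(double x) => x * Sigmoid(x);
static double SwishPrime(double x) => Sigmoid(x) + x * Sigmoid(x) * (1.0 - Sigmoid(x));

foreach (var x in new[] { -3.0, -0.5, 0.0, 0.5, 3.0 })
{
    const double eps = 1e-6;
    // Central difference approximates f'(x) and should match the formula closely.
    var numeric = (Swish(x + eps) - Swish(x - eps)) / (2.0 * eps);
    var analytic = SwishPrime(x);
    Console.WriteLine($"x={x,5}: numeric={numeric:F8} analytic={analytic:F8} diff={Math.Abs(numeric - analytic):E1}");
}
```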

Dependencies

  • Depends on Agent 5's work (TensorOperations activation methods) ✅ Complete
  • Blocks: Agent 9's interface architecture (SupportsJitCompilation property)

Notes

The SupportsJitCompilation property update for activation files (SwishActivation.cs, MishActivation.cs, etc.) will be done by Agent 9 as part of the interface architecture changes. This PR focuses solely on the gradient implementations.

🤖 Generated with Claude Code

ooples commented Nov 24 '25 00:11

[!WARNING]

Rate limit exceeded

@ooples has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 4 minutes and 42 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📥 Commits

Reviewing files that changed from the base of the PR and between fdc5dd725f20c9d38cb1d313755e61b2e4fc5174 and 7ca80b8b604f4a440e00ffbd782bc0a0f1d18d94.

📒 Files selected for processing (1)
  • src/Autodiff/TensorOperations.cs (1 hunks)
✨ Finishing touches
  • [ ] 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • [ ] Create PR with unit tests
  • [ ] Post copyable unit tests in a comment
  • [ ] Commit unit tests in branch feat/sigmoid-family-gradients

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.


Comment @coderabbitai help to get the list of available commands and usage tips.

coderabbitai[bot] commented Nov 24 '25 00:11