
feat(Story-3): Add IR Operations for Sigmoid Family (Group 2)

Open · ooples opened this issue 1 month ago • 1 comment

Story 3: IR Operations - Sigmoid Family

Added 10 IR operation classes for Sigmoid-family activations.

Activations Added:

  • SwishOp (uses IEngine.Swish)
  • SiLUOp (alias for Swish, uses IEngine.Swish)
  • MishOp (uses IEngine.Mish)
  • HardSigmoidOp (piecewise linear approximation)
  • HardTanhOp (piecewise linear approximation)
  • ScaledTanhOp (parameterized: a * tanh(b * x))
  • SoftplusOp (ln(1 + exp(x)), numerically stable; see the sketch after this list)
  • SoftSignOp (x / (1 + |x|))
  • BentIdentityOp ((sqrt(x² + 1) - 1) / 2 + x)
  • IdentityOp (f(x) = x)
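
A naive ln(1 + exp(x)) overflows once exp(x) exceeds the floating-point range. The description only states that Softplus uses a stable computation; a minimal scalar sketch of one common stable formulation (not necessarily the exact form used in ActivationOps.cs) looks like this:

```csharp
using System;

static class SoftplusMath
{
    // Stable softplus: ln(1 + exp(x)) rewritten as max(x, 0) + ln(1 + exp(-|x|)),
    // so exp() is only ever called with a non-positive argument and cannot overflow.
    public static double Softplus(double x) =>
        Math.Max(x, 0.0) + Math.Log(1.0 + Math.Exp(-Math.Abs(x)));

    // The derivative of softplus is the sigmoid, which is what a Backward() pass needs.
    public static double SoftplusGrad(double x) =>
        1.0 / (1.0 + Math.Exp(-x));
}
```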

Implementation Details:

  • All classes implement the IROp interface (see the sketch after this list)
  • Forward() uses IEngine.Swish and IEngine.Mish for GPU acceleration
  • Backward() implements gradient computation for each activation
  • Proper null checks for all parameters (no null-forgiving operators)
  • Comprehensive XML documentation with mathematical formulas
  • Numerical stability considerations (Softplus uses stable computation)
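
For orientation, here is a rough sketch of what the IROp shape might look like, assuming operands arrive untyped and each op validates them before use; the actual declarations live in src/JIT/ActivationOps.cs and may differ:

```csharp
// Hedged sketch only: the real interface may differ. Operands are assumed to be
// passed as object and validated by each op (as Tensor<T> in Forward, and as a
// tensor plus gradient object in Backward) before any computation.
public interface IROp
{
    object Forward<T>(object input) where T : struct;
    object Backward<T>(object input, object gradOutput) where T : struct;
}
```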

Pattern Followed:

  • Validates engine and input parameters in constructor
  • Type-safe tensor conversions with proper error messages
  • Uses IEngine methods where available (Swish, Mish)
  • ScaledTanhOp accepts parameters for flexibility
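
To make the ScaledTanhOp parameterization concrete, here is a self-contained scalar sketch of f(x) = a * tanh(b * x) and its derivative; the real op presumably applies the same formulas elementwise over a Tensor<T>, and the parameter names a and b follow the formula above rather than the actual constructor:

```csharp
using System;

// Scalar sketch of the ScaledTanh math, not the actual ScaledTanhOp implementation.
readonly struct ScaledTanh
{
    private readonly double _a, _b;

    public ScaledTanh(double a, double b) { _a = a; _b = b; }

    // f(x) = a * tanh(b * x)
    public double Forward(double x) => _a * Math.Tanh(_b * x);

    // f'(x) = a * b * (1 - tanh^2(b * x))
    public double Derivative(double x)
    {
        double t = Math.Tanh(_b * x);
        return _a * _b * (1.0 - t * t);
    }
}
```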

Build Status: ✅ Passed on all target frameworks (net462, net471, netstandard2.0)

Testing:

  • Build succeeded with 0 warnings, 0 errors
  • Ready for integration with DenseLayer JIT compilation

🤖 Generated with Claude Code

ooples · Nov 23 '25 23:11

Summary by CodeRabbit

  • New Features
    • Added support for multiple activation functions including Swish, SiLU, Mish, HardSigmoid, HardTanh, ScaledTanh, Softplus, SoftSign, BentIdentity, and Identity operations for JIT-compiled neural networks.


Walkthrough

Adds a new file containing an IROp interface and ten activation operation classes (Swish/SiLU, Mish, HardSigmoid, HardTanh, ScaledTanh, Softplus, SoftSign, BentIdentity, Identity) for JIT-compiled neural networks. Each class implements forward and backward passes using an IEngine abstraction, with input validation and both analytical and numerical derivative implementations.

Changes

  • New JIT Activation Operations (src/JIT/ActivationOps.cs): Introduces IROp interface with generic Forward<T> and Backward<T> methods. Adds ten activation operation classes: SwishOp, SiLUOp (alias inheriting from SwishOp), MishOp, HardSigmoidOp, HardTanhOp, ScaledTanhOp (with configurable scales), SoftplusOp, SoftSignOp, BentIdentityOp, and IdentityOp. Includes internal NumOps<T> helper providing numeric constants and conversions. All classes require IEngine for tensor operations and validate inputs as Tensor<T> or Gradient objects.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant ActivationOp as Activation Op
    participant IEngine
    participant Tensor

    Client->>ActivationOp: Forward(input)
    ActivationOp->>ActivationOp: Validate input type
    ActivationOp->>IEngine: Execute operation (e.g., Swish, Sigmoid)
    IEngine->>Tensor: Perform computation
    Tensor-->>IEngine: Result tensor
    IEngine-->>ActivationOp: Output tensor
    ActivationOp-->>Client: Return output

    Client->>ActivationOp: Backward(input, gradOutput)
    ActivationOp->>ActivationOp: Validate input & gradient types
    ActivationOp->>IEngine: Compute derivative
    IEngine->>Tensor: Derivative computation
    Tensor-->>IEngine: Gradient tensor
    IEngine-->>ActivationOp: Input gradient
    ActivationOp-->>Client: Return gradient
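
In call-site terms the diagram reduces to: construct an op against an engine, call Forward with the input, then call Backward with the input and the upstream gradient. The toy, fully self-contained version below mirrors that call pattern using Swish math (x * sigmoid(x)) on plain arrays; it is not the actual AiDotNet API, which goes through IEngine.Swish on Tensor<T>:

```csharp
using System;

// Toy stand-in so the example compiles without Tensor<T>/IEngine; it only
// mirrors the Forward/Backward flow from the diagram above.
class ToySwishOp
{
    static double Sigmoid(double x) => 1.0 / (1.0 + Math.Exp(-x));

    public double[] Forward(double[] input)
    {
        var y = new double[input.Length];
        for (int i = 0; i < input.Length; i++)
            y[i] = input[i] * Sigmoid(input[i]);              // swish(x) = x * sigmoid(x)
        return y;
    }

    public double[] Backward(double[] input, double[] gradOutput)
    {
        var gradInput = new double[input.Length];
        for (int i = 0; i < input.Length; i++)
        {
            double s = Sigmoid(input[i]);
            double y = input[i] * s;
            double dydx = y + s * (1.0 - y);                  // swish'(x) = swish(x) + sigmoid(x)(1 - swish(x))
            gradInput[i] = gradOutput[i] * dydx;              // chain rule with the upstream gradient
        }
        return gradInput;
    }
}

static class Demo
{
    static void Main()
    {
        var op = new ToySwishOp();
        var x = new[] { -1.0, 0.5, 2.0 };

        var y = op.Forward(x);                                // forward pass
        var gradIn = op.Backward(x, new[] { 1.0, 1.0, 1.0 }); // backward pass, unit upstream gradient

        Console.WriteLine(string.Join(", ", y));
        Console.WriteLine(string.Join(", ", gradIn));
    }
}
```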

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

  • Mathematical correctness: Verify each activation function's forward formula and derivative calculations (both analytical and numerical implementations)
  • Generic type constraints: Validate proper use of where T : struct constraints across all operations
  • Input validation robustness: Ensure consistent error handling for invalid Tensor<T> and Gradient inputs across all classes
  • Derivative implementations: Pay particular attention to MishOp's numerical gradient placeholder and confirm analytical derivatives for Swish, HardSigmoid, ScaledTanh, SoftSign, and BentIdentity (reference formulas listed after this list)
  • IEngine abstraction usage: Verify that all engine method calls (Swish, Sigmoid, Tanh, etc.) are correctly employed
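
For reference while checking those derivatives, the standard closed forms are listed below (σ is the logistic sigmoid; HardSigmoid and HardTanh are omitted because their derivatives are just the linear slope inside the clipping interval and 0 outside, with slope and interval depending on the chosen convention):

$$
\begin{aligned}
\text{Swish/SiLU:} \quad & f(x) = x\,\sigma(x), & f'(x) &= f(x) + \sigma(x)\bigl(1 - f(x)\bigr)\\
\text{Mish:} \quad & f(x) = x\tanh\bigl(\operatorname{softplus}(x)\bigr), & f'(x) &= \tanh\bigl(\operatorname{softplus}(x)\bigr) + x\,\sigma(x)\,\operatorname{sech}^2\bigl(\operatorname{softplus}(x)\bigr)\\
\text{ScaledTanh:} \quad & f(x) = a\tanh(bx), & f'(x) &= ab\bigl(1 - \tanh^2(bx)\bigr)\\
\text{SoftSign:} \quad & f(x) = \frac{x}{1 + |x|}, & f'(x) &= \frac{1}{(1 + |x|)^2}\\
\text{BentIdentity:} \quad & f(x) = \frac{\sqrt{x^2 + 1} - 1}{2} + x, & f'(x) &= \frac{x}{2\sqrt{x^2 + 1}} + 1
\end{aligned}
$$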

Poem

🐰 Activations now proliferate,
From Swish to Mish, they integrate!
With engines humming, tensors flow,
Forward, backward—watch them glow! ✨

Pre-merge checks

✅ Passed checks (3 passed)
  • Title check: ✅ Passed. The title accurately describes the main change: adding IR operations for the Sigmoid family of activation functions (10 new operation classes implementing IROp interface).
  • Description check: ✅ Passed. The description is directly related to the changeset, detailing the 10 activation operations added, their implementations, and how they follow established patterns.
  • Docstring Coverage: ✅ Passed. Docstring coverage is 100.00%, which is sufficient; the required threshold is 80.00%.


coderabbitai[bot] · Nov 23 '25 23:11