Feature Request: Add NEAT (Nash-Equilibrium Adaptive Training) Optimizer
Description: Neural network optimization for billion-parameter models faces critical gradient conflict issues where parameter updates across different layers interfere destructively, leading to slower convergence, higher variance, and resource inefficiency. NEAT (Nash-Equilibrium Adaptive Training) addresses this by modeling neural network optimization as a multi-agent game governed by Nash equilibrium principles, treating each layer as a rational agent. In the paper's reported experiments, this game-theoretic optimizer converges substantially faster than Adam, trains more stably, and yields corresponding GPU-hour, cost, and carbon savings.
Key Contributions (from 2025 TJAS research paper by Goutham Ronanki):
- Nash Gradient Equilibrium (NGE): Each layer acts as a rational player; gradients are projected onto the Nash equilibrium manifold using the network's graph Laplacian, reducing destructive gradient interference (see the toy sketch after this list).
- NG-Adam: Integrates NGE with Adam by adding equilibrium correction to momentum estimation.
- Nash Step Allocation (NSA): Layer-wise adaptive learning rates that increase for well-aligned gradients and decrease for high-conflict layers.
- Empirical Results:
  - 28% faster convergence (32,400 vs. 45,000 steps against the Adam baseline).
  - 20% reduction in GPU hours, with proportional cost and carbon savings (8–10 metric tons CO₂ per run).
  - Dramatic reduction in layer gradient conflicts (mean cosine similarity: Adam -0.12 → NEAT +0.08).
  - Benefits scale with model size (improvement grows from 16% at 50M to 31% at 1.2B parameters).
  - All results statistically significant (p < 0.001, Cohen's d > 0.8).
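As a toy illustration of the NGE projection above (not the paper's implementation), the sketch below stacks three equally sized, randomly generated layer gradients, builds a simple chain-graph Laplacian, applies the (I - mu*L) correction, and prints the mean pairwise cosine similarity between layer gradients before and after. The layer count, gradient shapes, chain coupling, and mu value are assumptions for illustration only.

```python
# Toy sketch of the Nash Gradient Equilibrium (NGE) projection.
# Assumptions (not from the paper): 3 layers with equal-sized flattened
# gradients, a chain (path-graph) Laplacian, and mu = 0.1.
import numpy as np

def path_graph_laplacian(n_layers):
    """Laplacian L = D - A of a chain graph linking consecutive layers."""
    A = np.zeros((n_layers, n_layers))
    for i in range(n_layers - 1):
        A[i, i + 1] = A[i + 1, i] = 1.0
    D = np.diag(A.sum(axis=1))
    return D - A

def nge_project(G, L, mu=0.1):
    """Apply the (I - mu*L) equilibrium correction to stacked layer gradients."""
    I = np.eye(L.shape[0])
    return (I - mu * L) @ G

def mean_pairwise_cosine(M):
    """Mean cosine similarity over all pairs of rows of M."""
    Mn = M / np.linalg.norm(M, axis=1, keepdims=True)
    C = Mn @ Mn.T
    return C[np.triu_indices_from(C, k=1)].mean()

rng = np.random.default_rng(0)
G = rng.normal(size=(3, 5))     # 3 layers, 5-dim flattened gradients (toy shapes)
L = path_graph_laplacian(3)
G_eq = nge_project(G, L, mu=0.1)

print("mean pairwise cosine before:", mean_pairwise_cosine(G))
print("mean pairwise cosine after: ", mean_pairwise_cosine(G_eq))
```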
Algorithmic Sketch (from paper Appendix):
# NEAT: Nash-Equilibrium Adaptive Training (pseudocode, one NG-Adam step per batch)
L = graph_laplacian(model_structure)          # layer-coupling Laplacian; fixed per architecture
for batch in training_data:
    G = compute_gradients(model, batch)       # stacked per-layer gradients
    G_equil = (I - mu * L) @ G                # Nash Gradient Equilibrium projection
    m = beta1 * m + (1 - beta1) * G_equil     # first-moment estimate (Adam momentum)
    v = beta2 * v + (1 - beta2) * G_equil**2  # second-moment estimate
    for i, param in enumerate(params):        # per-layer parameter update
        eta_i = eta / (1 + norm((L @ G)[i]))  # Nash Step Allocation: per-layer step size
        param -= eta_i * m[i] / (sqrt(v[i]) + eps)
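For concreteness, here is one possible NumPy reading of the sketch above: a minimal single-step function that assumes per-layer gradients are flattened to equal-length vectors and stacked into a matrix, interprets ||L G_i|| as the norm of the i-th row of L @ G, and omits Adam's bias correction just as the sketch does. The paper's actual tensor handling and hyperparameters may differ (see the attached PDF).

```python
# Hedged NumPy reading of one NEAT/NG-Adam step (assumes per-layer gradients
# are flattened to equal-length vectors and stacked into G of shape
# (n_layers, dim); the paper's exact formulation may differ).
import numpy as np

def neat_step(params, G, L, m, v, eta=1e-3, mu=0.1,
              beta1=0.9, beta2=0.999, eps=1e-8):
    """One NEAT update on stacked per-layer tensors of shape (n_layers, dim)."""
    I = np.eye(L.shape[0])
    G_eq = (I - mu * L) @ G                   # Nash Gradient Equilibrium projection
    m = beta1 * m + (1 - beta1) * G_eq        # first moment (Adam-style)
    v = beta2 * v + (1 - beta2) * G_eq ** 2   # second moment
    conflict = np.linalg.norm(L @ G, axis=1)  # per-layer conflict ||(L G)_i||
    eta_i = eta / (1.0 + conflict)            # Nash Step Allocation
    params = params - eta_i[:, None] * m / (np.sqrt(v) + eps)
    return params, m, v

# Toy usage with made-up shapes and a hard-coded 3-layer chain Laplacian.
L = np.array([[ 1., -1.,  0.],
              [-1.,  2., -1.],
              [ 0., -1.,  1.]])
rng = np.random.default_rng(1)
params = rng.normal(size=(3, 5))
m = np.zeros((3, 5))
v = np.zeros((3, 5))
G = rng.normal(size=(3, 5))                   # stand-in for real gradients
params, m, v = neat_step(params, G, L, m, v)
```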
Implementation Plan:
- tf.keras native optimizer integrating NGE, NG-Adam, and NSA (a possible user-facing API sketch follows after this list)
- Laplacian construction for neural architectures
- Full usage/benchmark notebooks
- Empirical validation pipeline on open datasets (text, vision)
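To seed the API discussion, below is a purely hypothetical user-facing interface: a NEAT optimizer class (the name and constructor arguments are proposals, not existing TF/Keras code) that mirrors Adam's signature plus the coupling strength mu, so it would drop into the usual model.compile / model.fit flow. model, train_ds, and val_ds are assumed to be defined elsewhere.

```python
# Proposed, hypothetical user-facing API (the NEAT class does not exist yet);
# constructor arguments mirror Adam's plus the equilibrium coupling strength mu.
optimizer = NEAT(
    learning_rate=1e-3,   # base step size eta
    mu=0.1,               # Nash equilibrium coupling strength
    beta_1=0.9,           # first-moment decay (as in Adam)
    beta_2=0.999,         # second-moment decay
    epsilon=1e-8,
)
model.compile(optimizer=optimizer,
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(train_ds, validation_data=val_ds, epochs=3)
```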
References:
- Ronanki, G. Nash-Equilibrium Adaptive Training (NEAT). TJAS, 2025 (full PDF attached, see GitHub)
- https://github.com/ItCodinTime/neat-optimizer
Theoretical background, further results, and step-by-step algorithmic descriptions are included in the attached PDF (see repo). Please review and advise on desired API/interface for TF Addons inclusion.
This project is no longer maintained or updated.
Is there another project through which I could get my optimizer into TF? Is there any way you could refer me or give me feedback, @sun1638650145?
There's not much you can do; if you insist on using TensorFlow, you'll have to implement this optimizer yourself.
Hi @sun1638650145, thank you for the response! Could you please give me the steps, link a tutorial, or point me in the right direction to do this?
Specifically, what I am trying to do is get my optimizer into the next release of TensorFlow alongside Adam so others can use it.
I don't recommend that, because you've already implemented it in PyTorch (which is sufficient). It seems the official TensorFlow team isn't maintaining this much anymore, and hardly anyone uses it now.
I have several issues and PRs that have been pending for over half a year without any response from the TensorFlow team. I'm just suggesting that you don't waste your time on it.
Alright then! Thank you so much for your time @sun1638650145