Fix FedProx MNIST tutorial for PyTorch
Summary
Models trained with different mu values in the FedProx optimizer end up with identical weights, even though different mu values should produce different models. This indicates that the proximal term of the FedProx algorithm is not being applied correctly.
Type of Change (Mandatory)
Specify the type of change being made.
- Bug fix : https://github.com/securefederatedai/openfl/issues/823
Description (Mandatory)
1. Incorrect Timing of the set_old_weights Call
In the current notebook implementation, set_old_weights is called after gradient computation but right before the optimizer step. This is problematic because:
- It sets the reference weights (w_old) to be nearly identical to the current weights
- When the optimizer step is executed, the proximal term mu * (param - w_old_param) becomes effectively zero
- This nullifies the effect of different mu values, resulting in identical training outcomes
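The effect described above can be reproduced with a framework-free sketch of one FedProx SGD step, w <- w - lr * (grad + mu * (w - w_old)). The function name and the concrete numbers below are illustrative, not taken from the OpenFL codebase:

```python
# Hypothetical one-parameter FedProx SGD step:
#   w <- w - lr * (grad + mu * (w - w_old))
def fedprox_step(w, grad, w_old, lr=0.1, mu=0.0):
    return w - lr * (grad + mu * (w - w_old))

w, grad = 2.0, 1.0

# Buggy ordering: w_old is captured right before the step, so
# w_old == w and the proximal term mu * (w - w_old) is exactly
# zero for every mu -- all mu values give the same update.
buggy = [fedprox_step(w, grad, w_old=w, mu=mu) for mu in (0.0, 0.5, 1.0)]
assert len(set(buggy)) == 1  # identical result regardless of mu

# Correct ordering: w_old is the global model received at the start
# of the round (1.0 here), so different mu values pull w toward it
# with different strengths.
fixed = [fedprox_step(w, grad, w_old=1.0, mu=mu) for mu in (0.0, 0.5, 1.0)]
assert len(set(fixed)) == 3  # mu now changes the update
```

This is why the notebook must call set_old_weights with the incoming global weights before local training starts, not right before the optimizer step.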
2. Missing Initial w_old Value in Optimizer
The optimizers (FedProxOptimizer and FedProxAdam) don't initialize the w_old parameter in their constructors. If step() is called before set_old_weights, accessing the uninitialized w_old may raise an error.
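One way to make step() safe regardless of call order is to snapshot the initial parameters as the default w_old in the constructor. The class below is a hypothetical, minimal scalar sketch of that guard, not the actual OpenFL optimizer code:

```python
class FedProxSGDSketch:
    """Hypothetical minimal FedProx-style optimizer illustrating
    the w_old initialization guard."""

    def __init__(self, params, lr=0.1, mu=0.5):
        self.params = list(params)
        self.lr, self.mu = lr, mu
        # Initialize w_old in the constructor so step() works even
        # if set_old_weights() is never called: default to a snapshot
        # of the initial parameters.
        self.w_old = list(self.params)

    def set_old_weights(self, weights):
        # Called once per round with the incoming global weights.
        self.w_old = list(weights)

    def step(self, grads):
        # w <- w - lr * (grad + mu * (w - w_old))
        self.params = [
            p - self.lr * (g + self.mu * (p - po))
            for p, g, po in zip(self.params, grads, self.w_old)
        ]

# Calling step() before set_old_weights() no longer fails; the
# proximal term is simply zero on the very first step.
opt = FedProxSGDSketch([2.0], lr=0.1, mu=0.5)
opt.step([1.0])
```

With this guard, the first step degenerates to plain SGD (proximal term zero), and subsequent rounds use the global weights supplied via set_old_weights.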
Testing
- Tested locally.
Thanks @nisha987, this is looking good. I had one more minor comment regarding mu.
Can you also rebase and sign off on your commits? https://github.com/securefederatedai/openfl/pull/1634/checks?check_run_id=42564995654
Added signoff to all commits.