Update TensorFlow Task Runner and related workspaces

Open kta-intel opened this issue 8 months ago • 3 comments

Related Issue: #973

Summary: This PR updates the TensorFlow Task Runner to use Keras as the high-level API, in line with best practices, and updates the existing TF workspaces accordingly. This enables the use of non-legacy optimizers (the legacy optimizers will be deprecated in future versions of TF/Keras).

Specifically, this PR:

  • Creates a new TensorFlowTaskRunner class in openfl.federated.task.runner_tf, which borrows heavily from KerasTaskRunner. The major difference is in how the optimizer weights are handled, a change necessitated by the removal of the .get_weights() method and the .weights attribute from the optimizer. The new TensorFlowTaskRunner extracts the weights from the optimizer's .variables() attribute instead
    • Also updated the train and validation task names to train_validation and task_validation to be consistent with the torch taskrunner
  • Archived the old TensorFlowTaskRunner as TensorFlowTaskRunner_v1 within openfl.federated.task.runner_tf and updated the __init__ files to make it callable. The rationale is to avoid breaking tutorials or upstream applications that still rely on the low-level TF taskrunner. It can be removed entirely in a future release as needed
    • Also updated the train and validation task names to train_validation and task_validation to be consistent with the torch taskrunner
  • Created a new tf_cnn_mnist workspace and updated the tf_cnn_histology workspace to run on the new TensorFlowTaskRunner using the src/dataloader.py and src/taskrunner.py convention.
    • Updated to TensorFlow v2.15.1 (the latest TensorFlow release that does not use Keras v3.x by default)
  • Minor updates to tf_3dunet_brats to use the new TensorFlowTaskRunner (did not make changes to the src files because I did not have the Brats3D dataset to verify a large update)
  • Minor updates to tf_2dunet to run on archived TensorFlowTaskRunner_v1
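
For reviewers, the optimizer weight handling described in the first bullet can be illustrated roughly as below. This is a minimal sketch, not the code in this PR: the helper name `optimizer_variables_to_dict` is hypothetical, and it assumes that each optimizer variable exposes `.name` and `.numpy()` (as `tf.Variable` does) and that `optimizer.variables` may be either a plain list-like attribute or a callable returning one, depending on the Keras version.

```python
from typing import Any, Dict


def optimizer_variables_to_dict(optimizer: Any) -> Dict[str, Any]:
    """Collect optimizer state as {variable name: value}.

    Hypothetical helper for illustration only (not part of OpenFL).
    Assumptions:
      * `optimizer.variables` is a list-like attribute, or a callable
        that returns one.
      * every variable has a `.name` attribute and a `.numpy()` method.
    """
    variables = optimizer.variables
    # Some optimizer versions expose `variables` as a method, others as an
    # attribute; normalise both cases to a plain iterable.
    if callable(variables):
        variables = variables()
    return {var.name: var.numpy() for var in variables}
```

With a compiled Keras model, the helper would presumably be called as `optimizer_variables_to_dict(model.optimizer)`. It assumes variable names are unique within the optimizer; if they are not, later entries overwrite earlier ones.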

Future work

  • Consolidation step still needed:
    • Migrate tf_2dunet from TensorFlowTaskRunner_v1 to the new TensorFlowTaskRunner
    • Migrate all keras workspaces from KerasTaskRunner to new TensorFlowTaskRunner and remove/archive KerasTaskRunner
  • Look into updating TensorFlowTaskRunner to run on TF v2.16+ with Keras 3.x (this may need large changes to weight handling that will likely not be backward compatible)
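
Until that work is done, a workspace could guard against an unsupported TensorFlow release with a small version check. The sketch below is only an illustration under the assumption stated above, that v2.15.x is the last release line not using Keras 3.x by default; the function name `defaults_to_keras3` is hypothetical and not part of OpenFL or TensorFlow.

```python
def defaults_to_keras3(tf_version: str) -> bool:
    """Return True if this TensorFlow version uses Keras 3.x by default.

    Hypothetical helper: assumes a version string of the form
    "major.minor[.patch...]" and that TF 2.16 is the first release
    that defaults to Keras 3.x.
    """
    # Only the major and minor components matter for this check.
    major, minor = (int(part) for part in tf_version.split(".")[:2])
    return (major, minor) >= (2, 16)
```

In a workspace this might be called with `tf.__version__` to fail early with a clear message before the task runner is built.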

kta-intel • Jun 06 '24 22:06