[Feature Request] [Workflow API] Enable users to import helper functions / utilities from a separate python script in FederatedRuntime Notebooks

Open refai06 opened this issue 7 months ago • 1 comments

Is your feature request related to a problem? Please describe.

This issue is related to #1565, where a user encountered a problem importing user-defined modules in a Jupyter notebook while running a FederatedRuntime experiment.

In the current implementation of the Workflow API, the Jupyter notebook is expected to define the Federated Learning experiment in its entirety. If the user attempts to import a user-defined module from a different Python script, it will fail for the following reasons:

  • When the notebook is exported, the script inside the generated_workspace does not contain the user-defined code
  • This leads to a ModuleNotFoundError during execution on participants in a distributed infrastructure

Describe the solution you'd like

Enable users to import helper functions from a separate Python script. For example, the FL experiment tutorial crowd_guard.ipynb imports some helper functions / classes from the user-defined script validation.py in the folder workspace:

workspace
├── crowd_guard.ipynb 
└── validation.py

validation.py (contains helper class)

# Outline only: method bodies are elided
class CrowdGuardClientValidation:

    def __distance_global_model_final_metric(distance_type: str, prediction_matrix,
                                             prediction_global_model, sample_indices_by_label,
                                             own_index):
        ...

    def __predict_for_single_model(model, local_data, device):
        ...

    def __do_predictions(models, global_model, local_data, device):
        ...

    def __prune_poisoned_models(num_layers, total_number_of_clients, own_client_index,
                                distances_by_metric, verbose=False):
        ...

    def validate_models(global_model, models, own_client_index, local_data, device):
        ...

CrowdGuard.ipynb (Jupyter notebook for Workflow API experiment)

#| export

from validation import CrowdGuardClientValidation

class FederatedFlow_CrowdGuard(FLSpec):

    @aggregator
    def start(self):
        ...

    @collaborator
    def train(self):
        ...

    @collaborator
    def local_validation(self):
        ...
        detected_suspicious_models = CrowdGuardClientValidation.validate_models(self.global_model,
                                                                                all_models,
                                                                                own_client_index,
                                                                                self.train_loader,
                                                                                self.device)
        ...

    @aggregator
    def end(self):
        ...

To support this use case, the existing export process in notebook_tools needs to be enhanced to:

  1. Analyse all imports in the Jupyter notebook and identify user-defined imports
  2. Copy the user-defined Python scripts / folders containing these imports into the generated_workspace

This ensures that the generated_workspace (shown below) includes all user-defined code and works on the distributed infrastructure.

workspace
├── generated_workspace
│   ├── src
│   │   ├── __init__.py
│   │   ├── experiment.py
│   │   └── validation.py
│   ├── .workspace
│   ├── plan
│   │   └── plan.yaml
│   └── requirements.txt
├── crowd_guard.ipynb 
└── validation.py
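
The two export steps above could be sketched roughly as follows. This is only an illustration, not the notebook_tools implementation: the function names find_local_imports and copy_local_modules are hypothetical, and the sketch assumes that an import counts as "user-defined" when a matching .py file or folder sits next to the notebook.

```python
import ast
import shutil
from pathlib import Path


def find_local_imports(script_source: str, notebook_dir: Path) -> set[str]:
    """Return top-level module names imported by the exported script that
    resolve to a .py file or a folder next to the notebook (assumption)."""
    local = set()
    for node in ast.walk(ast.parse(script_source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0:
            names = [node.module]
        else:
            continue
        for name in names:
            top = name.split(".")[0]
            if (notebook_dir / f"{top}.py").is_file() or (notebook_dir / top).is_dir():
                local.add(top)
    return local


def copy_local_modules(modules: set[str], notebook_dir: Path, src_dir: Path) -> None:
    """Copy each user-defined script or folder into the generated src directory."""
    src_dir.mkdir(parents=True, exist_ok=True)
    for name in sorted(modules):
        script = notebook_dir / f"{name}.py"
        if script.is_file():
            shutil.copy2(script, src_dir / script.name)
        else:
            shutil.copytree(notebook_dir / name, src_dir / name, dirs_exist_ok=True)
```

Calling find_local_imports on the exported script text with the notebook folder, then passing the result to copy_local_modules with generated_workspace/src as the target, would produce the layout shown above for the validation.py example. Third-party imports are ignored because nothing with their name exists beside the notebook.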

Describe alternatives you've considered

N.A.

Additional context

This enhancement shall be based on the following Requirements & Guidelines:

  • Export Directives

    • User-defined imports should be present in a notebook cell that is annotated with the #| export directive as its first line
    • Rationale:
      • The #| export directive is required for the user-defined imports to reach the exported script, where they are processed further
  • User-defined scripts should not install any packages:

    • User-defined scripts should not install any package
    • Rationale:
      • While the exported script is analyzed to identify dependencies and build the requirements.txt for the FL experiment, user-defined scripts are not analyzed by the infrastructure, so their dependencies are not added to it
  • Location of User defined python scripts:

    • User-defined modules must be placed in the same directory as the Jupyter Notebook to enable the infrastructure to correctly locate and copy these modules into the generated_workspace
    • Rationale:
      • Custom Path Dependencies: A user-defined module located outside the notebook directory, for example:
        /home/users
        └── fl_helper
            ├── __init__.py
            └── validation.py
        
        CrowdGuard.ipynb (Jupyter notebook for Workflow API experiment)
        ...
        sys.path.append('/home/users/fl_helper')
        
        from validation import CrowdGuardClientValidation
        
      • Importing from a custom path requires explicit modification of sys.path, which is not recommended and can lead to inconsistencies across a distributed system
      • Relying on custom paths or module locations outside the notebook directory adds complexity for the infrastructure in identifying and accessing the required user-defined modules
      • Placing the modules in the same directory as the Jupyter Notebook streamlines the process, simplifies access, and eliminates the need to modify sys.path. Example:
        workspace
        ├── crowd_guard.ipynb 
        └── fl_helper
            ├── __init__.py
            └── validation.py
        
  • Restrictions on User-defined imports

    • User-defined code should not modify sys.path to enable Python to find the scripts to import. For example:
      workspace
      ├── crowd_guard.ipynb 
      └── helper
          └── validation.py
      
      CrowdGuard.ipynb (Jupyter notebook for Workflow API experiment)
      ...
      ...
      sys.path.append('./helper')
      
      from validation import CrowdGuardClientValidation
      ...
      
      Recommended usage:
      from helper.validation import CrowdGuardClientValidation
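
A restriction like this could be checked automatically during export. The snippet below is a minimal sketch under that assumption: find_sys_path_mutations is a hypothetical helper, not part of OpenFL, and it only recognises direct calls of the form sys.path.append / insert / extend.

```python
import ast


def find_sys_path_mutations(source: str) -> list[int]:
    """Return the line numbers of sys.path.append/insert/extend calls."""
    lines = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Match <sys.path>.<append|insert|extend>(...)
        if (
            isinstance(func, ast.Attribute)
            and func.attr in {"append", "insert", "extend"}
            and isinstance(func.value, ast.Attribute)
            and func.value.attr == "path"
            and isinstance(func.value.value, ast.Name)
            and func.value.value.id == "sys"
        ):
            lines.append(node.lineno)
    return sorted(lines)
```

Other ways of changing the search path (for example sys.path += [...] or from sys import path) are not detected by this sketch and would need extra cases.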
      
  • User-defined imports should be self-contained:

    • User-defined code should not import other user-defined code from different Python scripts. For example:

      utils.py (contains additional helper functions)

      def calculate_accuracy(predictions, labels):
          correct = (predictions == labels).sum()
          return correct / len(labels)
      

      validation.py (contains helper functions)

      from utils import calculate_accuracy
      
      class CrowdGuardClientValidation:
          def validate_models(global_model, models, own_client_index, local_data, device):
              ...
              accuracy = calculate_accuracy(predictions, labels)
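
One way to surface a chained import like the one above is to scan each user-defined script for imports that point back into the workspace. The following is a sketch only: nested_local_imports is a hypothetical name, and the rule "anything that resolves beside the script, or any relative import, is user-defined" is an assumption made for illustration.

```python
import ast
from pathlib import Path


def nested_local_imports(script: Path, workspace: Path) -> set[str]:
    """Return the user-defined modules that a user-defined script imports."""
    found = set()
    for node in ast.walk(ast.parse(script.read_text())):
        if isinstance(node, ast.ImportFrom) and node.level > 0:
            # Relative imports always point at other user-defined code.
            found.add("." * node.level + (node.module or ""))
            continue
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [node.module]
        else:
            continue
        for name in names:
            top = name.split(".")[0]
            is_local = (workspace / f"{top}.py").is_file() or (workspace / top).is_dir()
            if is_local and top != script.stem:
                found.add(top)
    return found
```

For the example above, running this on validation.py would report utils, while utils.py itself would report nothing.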
      

refai06 • Jun 03 '25 12:06

I've run into the exact same issue, and I also think this functionality would be quite handy, especially in bigger projects.

Also, I asked about this in another thread a few days ago, and the response was (https://github.com/securefederatedai/openfl/issues/1565#issuecomment-2899848065):

At present, importing user-defined modules from separate Python files is not supported and [...] We will [...] consider the possibility of supporting this functionality in a future release

tayfunceylan • Jun 03 '25 20:06