[Feature Request] [Workflow API] Enable users to import helper functions / utilities from a separate python script in FederatedRuntime Notebooks
Is your feature request related to a problem? Please describe.
This issue is related to #1565, where a user encountered a problem importing user-defined modules in a Jupyter notebook while running a FederatedRuntime experiment.
In the current implementation of the Workflow API, the Jupyter notebook is expected to define the Federated Learning experiment in its entirety. If the user attempts to import a user-defined module from a different Python script, it will fail for the following reasons:
- When the notebook is exported, the script inside generated_workspace does not contain the user-defined code
- This leads to a ModuleNotFoundError failure during execution on participants in a distributed infrastructure
Describe the solution you'd like
Enable users to import helper functions from a separate Python script. For example, in the FL experiment tutorial, crowd_guard.ipynb imports helper functions and classes from the user-defined script validation.py in the folder workspace:
workspace
├── crowd_guard.ipynb
└── validation.py
validation.py (contains helper class)
    class CrowdGuardClientValidation:
        def __distance_global_model_final_metric(distance_type: str, prediction_matrix,
                                                 prediction_global_model, sample_indices_by_label,
                                                 own_index):
            ...
        def __predict_for_single_model(model, local_data, device):
            ...
        def __do_predictions(models, global_model, local_data, device):
            ...
        def __prune_poisoned_models(num_layers, total_number_of_clients, own_client_index,
                                    distances_by_metric, verbose=False):
            ...
        def validate_models(global_model, models, own_client_index, local_data, device):
            ...
CrowdGuard.ipynb (Jupyter notebook for Workflow API experiment)
    #| export
    from validation import CrowdGuardClientValidation

    class FederatedFlow_CrowdGuard(FLSpec):
        @aggregator
        def start(self):
            ...

        @collaborator
        def train(self):
            ...

        @collaborator
        def local_validation(self):
            ...
            detected_suspicious_models = CrowdGuardClientValidation.validate_models(
                self.global_model, all_models, own_client_index,
                self.train_loader, self.device)
            ...

        @aggregator
        def end(self):
            ...
To support this use case, the existing export process in notebook_tools needs to be enhanced to:
- Analyse all imports in the Jupyter notebook and identify user-defined imports
- Copy the user-defined Python scripts / folders containing these imports into generated_workspace
This ensures that generated_workspace (shown below) includes all user-defined code and works on the distributed infrastructure:
workspace
├── generated_workspace
│ ├── src
│ │ ├── __init__.py
│ │ ├── experiment.py
│ │ └── validation.py
│ ├── .workspace
│ ├── plan
│ │ └── plan.yaml
│ └── requirements.txt
├── crowd_guard.ipynb
└── validation.py
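The two export steps described above (identify user-defined imports, then copy them into the workspace) could be sketched roughly as follows. This is only an illustration, not the notebook_tools implementation: the function names `find_user_defined_imports` and `copy_user_defined_modules` are hypothetical, and the sketch assumes that a module counts as user-defined when a matching .py file or folder sits next to the notebook.

```python
import ast
import shutil
from pathlib import Path


def find_user_defined_imports(script_source: str, notebook_dir: Path) -> set[str]:
    """Return top-level imported names that resolve to a .py file or a
    folder located in the notebook directory (hypothetical helper)."""
    imported = set()
    for node in ast.walk(ast.parse(script_source)):
        if isinstance(node, ast.Import):
            imported.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            imported.add(node.module.split(".")[0])
    # Assumption: anything found next to the notebook is user-defined code.
    return {
        name for name in imported
        if (notebook_dir / f"{name}.py").is_file() or (notebook_dir / name).is_dir()
    }


def copy_user_defined_modules(names, notebook_dir: Path, src_dir: Path) -> None:
    """Copy each user-defined script or folder into the workspace src folder."""
    src_dir.mkdir(parents=True, exist_ok=True)
    for name in sorted(names):
        script = notebook_dir / f"{name}.py"
        if script.is_file():
            shutil.copy2(script, src_dir / script.name)
        else:
            shutil.copytree(notebook_dir / name, src_dir / name, dirs_exist_ok=True)
```

With the layout above, passing the exported script's source and the workspace folder to `find_user_defined_imports` would be expected to yield `{"validation"}`, which `copy_user_defined_modules` then places under generated_workspace/src.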
Describe alternatives you've considered
N.A.
Additional context
This enhancement shall be based on the following requirements and guidelines:
- Export directives
  - User-defined imports should be present in a notebook cell that is annotated with the #| export directive as the first line
  - Rationale:
    - #| export directives are required to export the user-defined imports to the exported script for further processing
- User-defined scripts should not install any packages
  - Rationale:
    - While the exported script is analyzed to identify dependencies and build the requirements.txt for the FL experiment, user-defined scripts are not analyzed by the infrastructure to identify dependencies
- Location of user-defined Python scripts
  - User-defined modules must be placed in the same directory as the Jupyter notebook to enable the infrastructure to correctly locate and copy these modules into generated_workspace
  - Rationale:
    - Custom path dependencies: a user-defined module located outside the notebook directory, for example:

          /home/users
          └── fl_helper
              ├── __init__.py
              └── validation.py

      CrowdGuard.ipynb (Jupyter notebook for Workflow API experiment)

          ...
          sys.path.append('/home/users/fl_helper')
          from validation import CrowdGuardClientValidation

    - Importing from a custom path requires explicit modification of sys.path, which is not recommended and can lead to inconsistency across the distributed system
    - Relying on custom paths or module locations outside the notebook directory adds complexity for the infrastructure in identifying and accessing the required user-defined modules
    - Placing the modules in the same directory as the Jupyter notebook streamlines the process, simplifies access, and eliminates the need to modify sys.path. Example:

          workspace
          ├── crowd_guard.ipynb
          └── fl_helper
              ├── __init__.py
              └── validation.py
- Restrictions on user-defined imports
  - User-defined code should not modify sys.path to enable Python to find the scripts to import. For example:

        workspace
        ├── crowd_guard.ipynb
        └── helper
            └── validation.py

    CrowdGuard.ipynb (Jupyter notebook for Workflow API experiment)

        ...
        sys.path.append('./helper')
        from validation import CrowdGuardClientValidation
        ...

    Recommended usage

        from helper.validation import CrowdGuardClientValidation
- User-defined imports should be self-contained
  - User-defined code should not import other user-defined code from different Python scripts. For example:

    utils.py (contains additional helper functions)

        def calculate_accuracy(predictions, labels):
            correct = (predictions == labels).sum()
            return correct / len(labels)

    validation.py (contains helper functions)

        from utils import calculate_accuracy

        class CrowdGuardClientValidation:
            def validate_models(global_model, models, own_client_index, local_data, device):
                ...
                accuracy = calculate_accuracy(predictions, labels)
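Two of the guidelines above (no sys.path modification, and self-contained user-defined scripts) lend themselves to a static check. The sketch below is one possible way to flag violations in a single script. The function name `check_guidelines` is hypothetical, it is not part of OpenFL, and it assumes user-defined code is anything that resolves to a file or folder next to the notebook. It does not attempt to detect package installation.

```python
import ast
from pathlib import Path

# Assumption: these are the sys.path method calls worth flagging.
PATH_MUTATORS = {"append", "insert", "extend"}


def check_guidelines(script_path: Path, notebook_dir: Path) -> list[str]:
    """Return guideline violations found in one user-defined script."""
    tree = ast.parse(script_path.read_text())
    problems = []
    for node in ast.walk(tree):
        # Flag calls such as sys.path.append(...) or sys.path.insert(...).
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Attribute):
            target = node.func.value
            if (node.func.attr in PATH_MUTATORS
                    and isinstance(target, ast.Attribute) and target.attr == "path"
                    and isinstance(target.value, ast.Name) and target.value.id == "sys"):
                problems.append((node.lineno, "modifies sys.path"))
        # Flag imports that resolve to another script or folder next to the notebook.
        names = []
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            names = [node.module]
        for name in names:
            top = name.split(".")[0]
            if (notebook_dir / f"{top}.py").is_file() or (notebook_dir / top).is_dir():
                problems.append((node.lineno, f"imports user-defined module '{top}'"))
    return [f"line {lineno}: {message}" for lineno, message in sorted(problems)]
```

Run against the validation.py example above, a check like this would report the `from utils import calculate_accuracy` line, since utils.py sits in the same folder.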
I've run into the exact same issue and I also think that this functionality would be quite handy, especially in bigger projects.
I also asked about this in another thread a few days ago, and the response was https://github.com/securefederatedai/openfl/issues/1565#issuecomment-2899848065 :
At present, importing user-defined modules from separate Python files is not supported and [...] We will [...] consider the possibility of supporting this functionality in a future release