[FEATURES] Implement Self-Reporting Progress Feature to Enhance Fluid Data Operation Monitoring and Management
Background
In the Fluid project, data operations (DataProcess, DataLoad, DataMigrate) are core functionality, covering data processing, preheating, and migration. To make these operations easier to monitor and manage, a self-reporting progress feature needs to be implemented. The feature will surface the progress of each data operation in its status.
Objectives
- Design a general mechanism to uniformly present progress in Fluid's data operation CRDs. This mechanism should be similar to Argo's Self-Reporting Progress: the user's workload writes progress updates to a file within the container, and Fluid's controller reflects them in the status of the CRD.
- Implement a proof-of-concept solution using DataProcess.
Feature Requirements
- Progress Reporting Mechanism:
  - Each data operation task should be capable of generating progress reports during execution.
  - The progress reports should follow an `N/M` format, where `N` is the completed amount of work and `M` is the total amount of work.
- Environment Variable Configuration:
  - Define an environment variable `FLUID_PROGRESS_FILE`, which specifies the location of the progress report file.
- Progress Report File:
  - The data operation task must periodically update the `FLUID_PROGRESS_FILE` during execution to report the current progress.
- Executor Reading Mechanism:
  - The executor should periodically (e.g., every 3 seconds) check the `FLUID_PROGRESS_FILE` to get the latest progress information.
- Progress Annotation:
  - Upon task initiation, the task's metadata should set an initial progress annotation, such as `fluid.io/data-progress: 0/100`.
- Progress Update:
  - If the `FLUID_PROGRESS_FILE` is updated, the executor should update the task's annotation to reflect the latest progress.
- Progress Display:
  - The monitoring system should be able to read the task's annotations and display the real-time progress of each data operation task on the user interface.
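The executor-side requirements (read the file periodically, validate the `N/M` value, update the annotation only on change) could be sketched as follows. This is a minimal illustration in Python, not Fluid's actual controller code: `parse_progress` and `poll_progress` are hypothetical names, the annotations are modelled as a plain dict instead of a Kubernetes API call, and rejecting `N > M` or `M = 0` as well as stopping once `N == M` are assumptions the requirements do not spell out.

```python
import re
import time
from pathlib import Path
from typing import Callable, Dict, Optional, Tuple

PROGRESS_ANNOTATION = "fluid.io/data-progress"  # annotation key from the requirements
_PROGRESS_RE = re.compile(r"(\d+)/(\d+)")


def parse_progress(text: str) -> Optional[Tuple[int, int]]:
    """Parse an 'N/M' report; return None if it is malformed or inconsistent."""
    match = _PROGRESS_RE.fullmatch(text.strip())
    if match is None:
        return None
    done, total = int(match.group(1)), int(match.group(2))
    # Assumption: a zero total or N > M is treated as invalid and ignored.
    if total == 0 or done > total:
        return None
    return done, total


def poll_progress(
    path: str,
    annotations: Dict[str, str],
    interval: float = 3.0,
    max_polls: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, str]:
    """Hypothetical executor loop: mirror the progress file into the annotations.

    `annotations` stands in for the task's metadata; a real executor would
    patch the object through the Kubernetes API instead of mutating a dict.
    """
    polls = 0
    while max_polls is None or polls < max_polls:
        polls += 1
        try:
            raw = Path(path).read_text()
        except FileNotFoundError:
            raw = ""  # the task has not reported anything yet
        parsed = parse_progress(raw)
        if parsed is not None:
            done, total = parsed
            value = f"{done}/{total}"
            # Only write when the value changed, to avoid redundant updates.
            if annotations.get(PROGRESS_ANNOTATION) != value:
                annotations[PROGRESS_ANNOTATION] = value
            if done == total:
                break  # assumption: stop polling once the work is complete
        sleep(interval)
    return annotations
```

Keeping the last valid annotation when the file is missing or malformed means a half-written report never overwrites good progress data; the `sleep` and `max_polls` parameters exist only to make the loop easy to exercise in a test.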
Example Code
apiVersion: fluid.io/v1alpha1
kind: DataProcess
metadata:
  name: train-flow-step1
spec:
  dataset:
    name: jfsdemo
    namespace: default
  processor:
    metadata:
      annotations:
        fluid.io/data-progress: 0/100
    script:
      image: nginx
      imageTag: latest
      command: ["bash"]
      script: |
        for i in $(seq 1 10); do
          sleep 10
          echo "$(($i*10))/100" > $FLUID_PROGRESS_FILE
        done
This example provides a basic framework. The work consists of the following tasks:
- Design a self-reporting progress feature for Fluid and provide a design document following the Fluid Design Workflow
- Implement a proof-of-concept solution using DataProcess
- Add documentation and a demo to support the usage of self-reporting progress for DataProcess in Fluid
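For tasks written in Python rather than bash, the reporting step from the example script could look like the sketch below. It assumes only what the requirements state (the `N/M` format and the `FLUID_PROGRESS_FILE` environment variable); the function name `report_progress` is hypothetical, and writing through a temporary file followed by a rename is an added design choice, not part of the proposal, intended to keep the executor from ever reading a half-written value. It also assumes the directory containing the progress file is writable by the task.

```python
import os
import tempfile
from typing import Optional


def report_progress(done: int, total: int, path: Optional[str] = None) -> str:
    """Write 'done/total' to the progress file and return the path used.

    If `path` is not given, the location is read from FLUID_PROGRESS_FILE.
    """
    if total <= 0 or not 0 <= done <= total:
        raise ValueError(f"invalid progress {done}/{total}")
    path = path or os.environ["FLUID_PROGRESS_FILE"]
    directory = os.path.dirname(path) or "."
    # Write to a temporary file in the same directory, then rename it over
    # the target so a concurrent reader sees either the old or the new value.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".progress-")
    try:
        with os.fdopen(fd, "w") as tmp:
            tmp.write(f"{done}/{total}\n")
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    return path
```

A processing loop would call `report_progress(step, total_steps)` after each unit of work, mirroring the `echo ... > $FLUID_PROGRESS_FILE` line in the bash example.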