[DSIP-77][task-api] Add cycle dependency type

Open fengjian1129 opened this issue 1 year ago • 11 comments

Search before asking

  • [X] I had searched in the DSIP and found no similar DSIP.

Motivation

DolphinScheduler currently lacks dependency types for tasks scheduled at the same cycle level. For example, the existing thisMonth type checks that instances have run successfully on every day of the current month before the dependent task can continue. In most data production, however, both tasks are scheduled monthly, and it is enough to detect that one instance has succeeded in the current month. Other dependency types, such as lastMonth, lastWeek, last7Days, and last3Days, have the same problem: they all check for successful instances across a whole time range. We need a task dependency type that only checks for one successful instance within the execution cycle.

Design Detail

Add cycle dependency types, such as last1Date, last2Date, ..., lastNdate, lastNmonthDate, lastNWeek, etc., which only detect one successful instance within the execution cycle.

Compatibility, Deprecation, and Migration Plan

No response

Test Plan

No response

Code of Conduct

fengjian1129 avatar Oct 18 '24 04:10 fengjian1129

What does the UI look like when lastNdate is selected?

davidzollo avatar Oct 18 '24 05:10 davidzollo

What does the UI look like when lastNdate is selected?

Now I want to implement some fixed types first because I can't write the front-end interface. I can organize the documents for the newly added fixed scheduling types and submit them to the community

fengjian1129 avatar Oct 18 '24 06:10 fengjian1129

I think this proposal is very good. I'm +1 on this.

Now I want to implement some fixed types first because I can't write the front-end interface. I can organize the documents for the newly added fixed scheduling types and submit them to the community

Since this change is not very large, it will only be merged once the front end, the back end, the documentation, and the tests are all complete.

SbloodyS avatar Oct 18 '24 07:10 SbloodyS

+1

davidzollo avatar Oct 18 '24 07:10 davidzollo

May I ask if there is any documentation for compiling the backend and frontend code? I want to try debugging; I have only submitted backend code before.

fengjian1129 avatar Oct 18 '24 08:10 fengjian1129

May I ask if there is any documentation for compiling backend code and frontend code? I want to try debugging, I only submitted backend code before

You can take a look at https://dolphinscheduler.apache.org/zh-cn/docs/dev/contribute/development-environment-setup

SbloodyS avatar Oct 18 '24 08:10 SbloodyS

thx , bro

fengjian1129 avatar Oct 18 '24 08:10 fengjian1129

We also ran into this issue in practice. It can be resolved by introducing a time-period type, such as anyDayLastMonth, so that downstream dependent tasks can proceed as long as the workflow instance succeeded at least once during that period. I am willing to submit a PR. @SbloodyS, please check whether this approach is feasible.

Add several types: anyTimeToday, anyTimeLast1Days, anyTimeLast2Days, anyTimeLast3Days, anyTimeLast7Days, anyTimeThisWeek, anyTimeLastWeek, anyTimeThisMonth, anyTimeLastMonth.

Add a parameter to calculateResultForTasks that indicates whether the dependency is of the "any time in a period of time" kind. If any workflow instance succeeded during the time period, the method returns a success status.

private DependResult calculateResultForTasks(DependentItem dependentItem,
                                             List<DateInterval> dateIntervals,
                                             int testFlag) {

    DependResult result = DependResult.FAILED;
    for (DateInterval dateInterval : dateIntervals) {
        ProcessInstance processInstance = findLastProcessInterval(dependentItem.getDefinitionCode(),
                dateInterval, testFlag);
        if (processInstance == null) {
            return DependResult.WAITING;
        }
        // need to check workflow for updates, so get all task and check the task state
        if (dependentItem.getDepTaskCode() == Constants.DEPENDENT_WORKFLOW_CODE) {
            result = dependResultByProcessInstance(processInstance);
        } else if (dependentItem.getDepTaskCode() == Constants.DEPENDENT_ALL_TASK_CODE) {
            result = dependResultByAllTaskOfProcessInstance(processInstance, testFlag);
        } else {
            result = dependResultBySingleTaskInstance(processInstance, dependentItem.getDepTaskCode(), testFlag);
        }
        if (result != DependResult.SUCCESS) {
            break;
        }
    }
    return result;
}
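
To make the "any time in a period of time" idea concrete, here is a simplified, standalone sketch. It is not DolphinScheduler code: the `DependResult` enum is a local stand-in, the per-interval results are passed in directly instead of being looked up through `findLastProcessInterval`, and `anySuccess` is a hypothetical name for the proposed parameter. Returning WAITING when nothing has succeeded yet but some interval has no instance is also an assumption.

```java
import java.util.List;

// Local stand-in for the scheduler's result type (assumption, not the real class).
enum DependResult { SUCCESS, FAILED, WAITING }

class AnySuccessSketch {

    /**
     * Combines one result per date interval.
     * A null entry means no instance was found for that interval.
     * anySuccess is a hypothetical flag for the proposed "any time in a period" mode.
     */
    static DependResult combine(List<DependResult> perInterval, boolean anySuccess) {
        if (!anySuccess) {
            // Mirrors the loop above: stop at the first interval that is not SUCCESS.
            DependResult result = DependResult.FAILED;
            for (DependResult r : perInterval) {
                if (r == null) {
                    return DependResult.WAITING;
                }
                result = r;
                if (result != DependResult.SUCCESS) {
                    break;
                }
            }
            return result;
        }
        // Proposed mode: a single successful interval is enough.
        boolean sawUnfinished = false;
        for (DependResult r : perInterval) {
            if (r == DependResult.SUCCESS) {
                return DependResult.SUCCESS;
            }
            if (r == null || r == DependResult.WAITING) {
                sawUnfinished = true;
            }
        }
        // Assumption: keep waiting while some interval has no finished instance yet.
        return sawUnfinished ? DependResult.WAITING : DependResult.FAILED;
    }
}
```

With results of FAILED, missing, SUCCESS for three intervals, the proposed mode yields SUCCESS, while the existing mode stops at the first FAILED.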

dill21yu avatar May 26 '25 10:05 dill21yu

@SbloodyS @ruanwenjun Could you help check whether this change to the dependency cycle types is feasible?

dill21yu avatar Jun 11 '25 06:06 dill21yu

@SbloodyS @ruanwenjun Could you help check whether this change to the dependency cycle types is feasible?

+1 to the feature. We may need to use a dynamic expression, e.g. (T-1,h) or (T-1,d), to express this.
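
As an illustration only, one possible reading of such an expression is "the cycle N units before the cycle that contains the reference time T". The grammar, the class name `RelativeCycleExpression`, and the restriction to the units `h` and `d` are all assumptions made for this sketch; the thread does not define them.

```java
import java.time.LocalDateTime;
import java.time.temporal.ChronoUnit;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical parser for expressions such as "(T-1,h)" or "(T-1, d)".
class RelativeCycleExpression {

    // Assumed grammar: "(T-<offset>,<unit>)" with unit h (hour) or d (day).
    private static final Pattern FORMAT =
            Pattern.compile("\\(\\s*T\\s*-\\s*(\\d+)\\s*,\\s*([hd])\\s*\\)");

    final int offset;
    final ChronoUnit unit;

    private RelativeCycleExpression(int offset, ChronoUnit unit) {
        this.offset = offset;
        this.unit = unit;
    }

    static RelativeCycleExpression parse(String text) {
        Matcher m = FORMAT.matcher(text.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Unsupported expression: " + text);
        }
        ChronoUnit unit = m.group(2).equals("h") ? ChronoUnit.HOURS : ChronoUnit.DAYS;
        return new RelativeCycleExpression(Integer.parseInt(m.group(1)), unit);
    }

    /** Inclusive start of the cycle that lies offset units before the cycle containing t. */
    LocalDateTime cycleStart(LocalDateTime t) {
        return t.truncatedTo(unit).minus(offset, unit);
    }

    /** Exclusive end of that same cycle. */
    LocalDateTime cycleEnd(LocalDateTime t) {
        return cycleStart(t).plus(1, unit);
    }
}
```

Under this reading, (T-1,d) evaluated at 2025-06-13 06:30 covers the whole of 2025-06-12, and (T-1,h) covers 05:00 to 06:00 on 2025-06-13.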

ruanwenjun avatar Jun 13 '25 06:06 ruanwenjun

Add several types:"anyTimeToday,anyTimeLast1Days,anyTimeLast2Days,anyTimeLast3Days,anyTimeLast7Days,anyTimeThisWeek,anyTimeLastWeek,anyTimeThisMonth,anyTimeLastMonth" Add a parameter in calculateResultForTasks to determine if the task execution result is "any time in a period of time". If any workflow instance is successful during the time period, it will return a success status

At present, we have two different types of dependency cycles.

  1. Individual cycle dependence, such as one hour, last day, etc...
  2. Continuous time dependence, for example, last month means every day of the month in a row.

Adding the anytime types directly on top of the status quo would confuse users, so I suggest reworking the naming of these types.

  1. Individual cycle dependence
     1.1 currentHour stays as it is.
     1.2 Use lastNHours to represent any number of hours, parsed from an expression.
     1.3 lastWeek stays as it is.
     1.4 lastMonday stays as it is.
     1.5 lastTuesday stays as it is.
     1.6 lastWednesday stays as it is.
     1.7 lastThursday stays as it is.
     1.8 lastFriday stays as it is.
     1.9 lastSaturday stays as it is.
     1.10 lastSunday stays as it is.
     1.11 thisMonth is newly added.
     1.12 Use lastNMonth to represent any number of months, parsed from an expression.
  2. Continuous time dependence
     2.1 thisMonthPeriod represents a continuous month.
     2.2 last1MonthPeriod represents a specified continuous month, parsed from an expression.

This is just my initial idea. If you have a better idea, please feel free to point it out. @dill21yu
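
For illustration, parsing names such as lastNHours, lastNMonth, and last1MonthPeriod from an expression could look roughly like the sketch below. The class `CycleTypeName`, the accepted units, and the treatment of the Period suffix as a "continuous" flag are assumptions for this example, not an agreed design.

```java
import java.util.Locale;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical parser for type names of the form last<N><Unit>[Period].
class CycleTypeName {

    // Assumed units; singular and plural spellings are both accepted.
    private static final Pattern NAME =
            Pattern.compile("last(\\d+)(Hours?|Days?|Weeks?|Months?)(Period)?");

    final int count;          // the N in lastN...
    final String unit;        // normalized to lowercase singular, e.g. "hour"
    final boolean continuous; // true when the name ends with "Period"

    private CycleTypeName(int count, String unit, boolean continuous) {
        this.count = count;
        this.unit = unit;
        this.continuous = continuous;
    }

    /** Returns empty for fixed names such as lastMonday, which need no parsing. */
    static Optional<CycleTypeName> parse(String name) {
        Matcher m = NAME.matcher(name);
        if (!m.matches()) {
            return Optional.empty();
        }
        String unit = m.group(2).toLowerCase(Locale.ROOT);
        if (unit.endsWith("s")) {
            unit = unit.substring(0, unit.length() - 1);
        }
        return Optional.of(new CycleTypeName(
                Integer.parseInt(m.group(1)), unit, m.group(3) != null));
    }
}
```

Fixed names like currentHour or lastMonday fall through as empty and would keep whatever handling they have today.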

SbloodyS avatar Jun 14 '25 09:06 SbloodyS

I agree with your idea. The new 'thisMonthPeriod' means that the dependency condition only needs to satisfy any successful instance within this month. Besides 'thisMonthPeriod' and 'last1MonthPeriod', do we also need to add types like 'last1WeekPeriod' or others? Can this issue be assigned to me to work on? @SbloodyS

dill21yu avatar Jun 23 '25 07:06 dill21yu

The new 'thisMonthPeriod' means that the dependency condition only needs to satisfy any successful instance within this month.

It does not. thisMonthPeriod represents a continuous month, meaning every day of this month must succeed, which is consistent with the current dependency behavior.

Besides 'thisMonthPeriod' and 'last1MonthPeriod', do we also need to add types like 'last1WeekPeriod' or others?

Yes. We also need lastNMonthPeriod.

Can this issue be assigned to me to work on

Sure. We need to discuss it thoroughly before opening a PR.

SbloodyS avatar Jun 25 '25 06:06 SbloodyS

Could you let me know if this approach looks good? @SbloodyS @ruanwenjun

Add a checkMode field to differentiate between "ANY_SUCCESS" and "ALL_SUCCESS" requirements while maintaining backward compatibility.

Behavior Examples:

dateValue  | checkMode   | Meaning                                 | Legacy Equivalent
-----------|-------------|-----------------------------------------|--------------------------
last7Days  | ANY_SUCCESS | Any single success in 7 days (default)  | Original anyTimeLast7Days
last7Days  | ALL_SUCCESS | Requires success on every day in 7 days | N/A (new)
last1Month | ANY_SUCCESS | Any success in the last month           | Original lastMonth

Advantages:

  • Full backward compatibility: old configs default to ANY_SUCCESS.
  • Clear separation: type = time window, checkMode = validation logic.
  • Scalable: future time units (e.g., lastNHours) need no duplicate types.

Why no _Period suffix types?

  • Avoids redundancy (e.g., last7Days + last7DaysPeriod).
  • Default ANY_SUCCESS preserves legacy behavior.

Thanks for the discussion!
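
A minimal sketch of how the checkMode field could be resolved and applied, assuming the dependency item is available as a simple string map and that one success flag per cycle inside the window has already been computed. `CheckModeSketch`, the map representation, and the choice to treat an empty window as not satisfied under ALL_SUCCESS are hypothetical details for illustration.

```java
import java.util.List;
import java.util.Map;

enum CheckMode { ANY_SUCCESS, ALL_SUCCESS }

// Hypothetical helper; not part of DolphinScheduler.
class CheckModeSketch {

    /** Reads the optional "checkMode" key; a missing key falls back to ANY_SUCCESS as proposed. */
    static CheckMode resolve(Map<String, String> dependItem) {
        String raw = dependItem.get("checkMode");
        return raw == null ? CheckMode.ANY_SUCCESS : CheckMode.valueOf(raw);
    }

    /** successPerCycle holds one flag per cycle (for example per day) inside the dateValue window. */
    static boolean satisfied(List<Boolean> successPerCycle, CheckMode mode) {
        if (mode == CheckMode.ALL_SUCCESS) {
            // Assumption: an empty window does not count as "all succeeded".
            return !successPerCycle.isEmpty()
                    && successPerCycle.stream().allMatch(Boolean::booleanValue);
        }
        return successPerCycle.stream().anyMatch(Boolean::booleanValue);
    }
}
```

For a last7Days window with a single successful day, the item is satisfied under ANY_SUCCESS and not satisfied under ALL_SUCCESS, which matches the first two rows of the table.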

dill21yu avatar Sep 11 '25 10:09 dill21yu