Schedule Conductor workflow is a scheduler as a service that runs in the cloud with Netflix conductor embedded in it. It runs as an extension module of conductor.

Schedule Conductor Workflows (A Netflix Conductor Community Project)

Schedule Conductor workflow is a Spring Boot based (version 2.0.0+) scheduler as a service that runs in the cloud with Netflix conductor embedded in it. It can also run as an extension module of conductor. This is a community project of the Netflix conductor community.

Builds

Module Build
scheduledwf-server Maven Central
scheduledwf-module Maven Central

Motivation

  • In the digital space, many use cases are solved by running schedulers. Some common cases are:
    1. publishing a site map for an e-commerce website.
    2. refreshing a cache every day at a fixed time.
    3. sending an email notification at a scheduled time, etc.
  • If your architecture is microservices based, you have only two options:
    1. either add scheduling capability (like Quartz scheduler, Spring scheduler, etc.) to the service that needs to schedule a task,
    2. or set up a separate microservice that performs scheduling for the required use cases. This service interacts with other services for its data needs through REST APIs.
  • In the end we have an unnecessary mesh of schedulers. After a certain point, such a setup gets completely out of control.
  • Solution to the problem is Schedule Conductor workflow:
    • This can be deployed in place of conductor server.
    • Embedded conductor provides service orchestration capability.
    • Extension module provides scheduling capability.
    • Conductor workflow with cron expression can be scheduled through REST API.
    • Scheduled job spawned internally starts a workflow at a scheduled time.
  • Schedule Conductor workflow is extendable to work with all persistence stores supported by conductor.

Quickly usable in PRODUCTION:

  • version 2.0.0+
    • This version is compatible with conductor-boot v3.3.0 and requires Java 11.
    • Refer to the Getting started section below.
  • up to version 1.2.2
    • Scheduling module can be enabled with property conductor.additional.modules=io.github.jas34.scheduledwf.config.ScheduledWfServerModule
    • Deploy scheduledwf-server instead of conductor-server.
    • Schedule Conductor is compatible with Java 8 and has embedded Conductor v2.30.4

You are done!

Architecture

High Level Architecture

Scheduled Conductor

API
  • Expose REST API interface for scheduling a workflow with metadata definition and cron expression (Scheduled Workflow Metadata Management)
  • Expose REST API interface for managing running schedulers (Scheduler Management)
  • Read more in the section Scheduling And Managing A Workflow.
SERVICE
  • Consists of a manager that spawns scheduler processes.
  • One-stop place to manage the complete life cycle of schedulers.
  • Read details in section Component Details.
STORE
  • Currently implemented for MYSQL.
  • Can be extended to other persistence stores offered by conductor.
  • Get more details on DAO from section Persistence Layer.

Getting started

Running scheduled workflow as a server:
  • Download jar from maven central scheduledwf-server
  • Alternatively:
    • you can fork a branch from master
    • run mvn --settings settings.xml -P bintray clean install
  • Executable jar can be found at scheduledwf/scheduledwf-server/target/scheduledwf-server-{version}.jar
  • start server with command:
    • up to version 1.2.2
      • java -jar scheduledwf-server-{version}.jar [PATH TO PROPERTY FILE] [log4j.properties file path]
    • version 2.0.0+
      • if running with default classpath property file application.properties
        • java -jar scheduledwf-server-{version}.jar
      • if running with external property file:
        • java -DCONDUCTOR_CONFIG_FILE={properties_file_path} -jar scheduledwf-server-{version}.jar
Running scheduled workflow as a module
  • If you are already running conductor server forked from Netflix conductor then you can use scheduled workflow as an additional dependency.
  • Add the following dependency in:
    • maven build:
      <dependency>
          <groupId>io.github.jas34</groupId>
          <artifactId>scheduledwf-module</artifactId>
          <version>${scheduledwf-version}</version>
      </dependency>
      
    • gradle build:
      compile 'io.github.jas34:scheduledwf-module:${scheduledwf-version}'
      

Scheduling And Managing A Workflow

  • REST operations for scheduling can be accessed on the conductor swagger at http://{host}:{port}
  • Example: Let us schedule a sample workflow that checks the health of the conductor server every minute. Sample definitions are:
    1. check-conductor-health-task def
    2. check-conductor-health-workflow def
    3. check-conductor-health-schedule def
      (Tip: You can use Cron Maker to generate cron expression.)
    • From swagger use Scheduled Workflow Metadata Management:
      • Up to version 1.2.2
        • POST /scheduling/metadata/scheduleWf: Schedule new workflow
        • GET /scheduling/metadata/scheduleWf: Get scheduling metadata of scheduled workflows.
        • GET /scheduling/metadata/scheduleWf/{name}: Get scheduling metadata of scheduled workflows by workflow name.
        • PUT /scheduling/metadata/scheduleWf/{name}: Update the status of scheduled workflow metadata.
      • Version 2.0.0+
        • POST /api/scheduling/metadata/scheduleWf: Schedule new workflow
        • GET /api/scheduling/metadata/scheduleWf: Get scheduling metadata of scheduled workflows.
        • GET /api/scheduling/metadata/scheduleWf/{name}: Get scheduling metadata of scheduled workflows by workflow name.
        • PUT /api/scheduling/metadata/scheduleWf/{name}: Update the status of scheduled workflow metadata.
    • From swagger use Scheduler Management:
      • to search for schedule managers running on different nodes of the cluster.
      • to search for scheduled jobs based on scheduling metadata.
      • to search for the different runs of scheduled jobs at their scheduled times. The detailed data is returned by IndexScheduledWfDAO.
      • (Figure: glimpse of workflow scheduling.)
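
As a rough illustration of calling the version 2.0.0+ scheduling endpoint from code, here is a minimal sketch that only builds the POST request using the JDK 11 HTTP client. The endpoint path comes from the list above. The JSON field names (wfName, wfVersion, status, cronExpression), the class name and the helper methods are assumptions made for illustration; check the swagger model on your server for the actual schema.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Sketch: builds (but does not send) the request that schedules a workflow.
final class ScheduleWorkflowRequest {

    // Endpoint path documented for version 2.0.0+.
    static final String SCHEDULE_PATH = "/api/scheduling/metadata/scheduleWf";

    private ScheduleWorkflowRequest() {}

    // ASSUMPTION: these JSON field names are illustrative, not confirmed
    // by the documentation above. Verify them against the swagger model.
    static String payload(String workflowName, int version, String cron) {
        return "{\"wfName\":\"" + workflowName + "\","
             + "\"wfVersion\":" + version + ","
             + "\"status\":\"RUN\","
             + "\"cronExpression\":\"" + cron + "\"}";
    }

    // baseUrl is the http://{host}:{port} of the running server.
    static HttpRequest build(String baseUrl, String jsonBody) {
        return HttpRequest.newBuilder()
            .uri(URI.create(baseUrl + SCHEDULE_PATH))
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(jsonBody))
            .build();
    }
}
```

The request can then be sent with java.net.http.HttpClient, for example HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()). For versions up to 1.2.2 the same idea applies with the /scheduling/metadata/scheduleWf path.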

Runtime Model

Scheduled Conductor

Component Details

Scheduler Manager

  • This component is meant to manage lifecycle of scheduled workflow.
  • It takes lifecycle state decision for a scheduler with the help of scheduled jobs registry.
  • It schedules the schedulers through scheduling assistant.
  • It indexes the scheduling information through the IndexScheduledWfDAO interface.
    Scheduled Jobs Registry
    • This registry acts as a single source of truth to know whether a particular workflow is required to be scheduled, paused or deleted.
    • It can be customized by implementing the ScheduledProcessRegistry interface.
    • The default implementation can be found at io.github.jas34.scheduledwf.execution.DefaultScheduledProcessRegistry.
    • It reads scheduled workflow details through ScheduledWfExecutionDAO.

Scheduling Assistant

  • This is an abstraction layer for job scheduling.
  • It comes with a default implementation, DefaultWorkflowSchedulingAssistant.
  • It:
    1. creates jobs.
    2. pauses jobs.
    3. deletes scheduled jobs.
  • It contains the factory WorkflowSchedulerFactory, which returns an instance of the WorkflowScheduler interface.
  • The assistant can be customized by implementing the WorkflowSchedulingAssistant interface.
    WorkflowSchedulerFactory
    • This is the core abstraction for defining a scheduled process of your choice (WorkflowSchedulerFactory<T extends ScheduledProcess>).
    • ScheduledProcess is a granular entity that is expected to hold a reference to the scheduled process/thread. It is currently implemented as CronBasedScheduledProcess.
    • The default implementation is DefaultWorkflowSchedulerFactory.
    • Currently, it returns CronBasedWorkflowScheduler.
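
The generic bound on the factory can be illustrated with a small sketch. All type names below are prefixed with Sketch and are hypothetical; they only mirror the shape WorkflowSchedulerFactory<T extends ScheduledProcess> described above and are not the project's real interfaces or method signatures.

```java
// Handle to something that has been scheduled (hypothetical).
interface SketchScheduledProcess {
    String jobName();
}

// Cron flavoured handle, playing the role of CronBasedScheduledProcess.
final class SketchCronProcess implements SketchScheduledProcess {
    private final String jobName;
    private final String cronExpression;

    SketchCronProcess(String jobName, String cronExpression) {
        this.jobName = jobName;
        this.cronExpression = cronExpression;
    }

    @Override
    public String jobName() { return jobName; }

    String cronExpression() { return cronExpression; }
}

// Scheduler that produces handles of type T (hypothetical signature).
interface SketchWorkflowScheduler<T extends SketchScheduledProcess> {
    T schedule(String jobName, String cronExpression);
}

// Factory typed by the kind of process its schedulers produce.
interface SketchSchedulerFactory<T extends SketchScheduledProcess> {
    SketchWorkflowScheduler<T> getScheduler();
}

final class SketchCronSchedulerFactory
        implements SketchSchedulerFactory<SketchCronProcess> {
    @Override
    public SketchWorkflowScheduler<SketchCronProcess> getScheduler() {
        // A real scheduler would start a cron trigger here;
        // this sketch only creates the handle.
        return SketchCronProcess::new;
    }
}
```

Because the factory is parameterised by the process type, swapping in a different scheduling mechanism means providing another factory implementation with its own process type, without touching the callers.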

Cron Based Workflow Scheduler (Jobs Scheduling)

  • The scheduling capability is completely customizable by implementing WorkflowScheduler interface and by returning applicable instance from WorkflowSchedulerFactory.
  • The default behaviour is to enable CronBasedWorkflowScheduler.
  • It is built on the wisp scheduler, which provides in-memory scheduling capabilities. As a result, CronBasedWorkflowScheduler also schedules its jobs in memory.
  • Each scheduled job is provided with an instance of a Runnable task through the ScheduledTaskProvider interface.
    ScheduledTaskProvider
    • The default implementation of task provider is DefaultScheduledTaskProvider.
    • It creates a fully customizable task for the job through getTask() before scheduling.
    • It indexes workflow start executions through a callback to IndexExecutionDataCallback.
    • It uses LockingService to prevent concurrent execution of the same job on more than one server at the scheduled time.
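
The relationship between a task provider and an in-memory scheduler can be sketched as follows. This example deliberately uses the JDK's ScheduledExecutorService with a fixed rate instead of wisp and cron expressions, so it shows the pattern rather than the actual implementation. SketchTaskProvider and SketchInMemoryScheduler are hypothetical names, and the getTask signature here is an assumption.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for ScheduledTaskProvider:
// turns a workflow name into the Runnable executed on every trigger.
interface SketchTaskProvider {
    Runnable getTask(String workflowName);
}

// In-memory scheduler: jobs live only inside this JVM, as with wisp.
final class SketchInMemoryScheduler implements AutoCloseable {
    private final ScheduledExecutorService executor =
        Executors.newSingleThreadScheduledExecutor();
    private final SketchTaskProvider provider;

    SketchInMemoryScheduler(SketchTaskProvider provider) {
        this.provider = provider;
    }

    // Asks the provider for the task, then runs it at a fixed rate.
    // The returned future can be cancelled to stop the job.
    ScheduledFuture<?> schedule(String workflowName, long periodMillis) {
        Runnable task = provider.getTask(workflowName);
        return executor.scheduleAtFixedRate(
            task, 0, periodMillis, TimeUnit.MILLISECONDS);
    }

    @Override
    public void close() {
        executor.shutdownNow();
    }
}
```

A provider is typically a lambda such as name -> () -> startWorkflow(name), where startWorkflow is whatever call starts the conductor workflow in your setup. Since the schedule is held in memory, every server node schedules the same job, which is why a lock is needed at execution time.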

Locking Service

  • LockingService is composed of the LockProvider available in conductor.
  • It contains an ExecutionPermitter:
    1. to get an execution permit for a fixed period of time through boolean issue(ScheduledTaskDef taskDef)
    2. to return the permit after use through void giveBack(ScheduledTaskDef taskDef).
    Execution Permitter
    • It uses PermitDAO to persist the Permit for a fixed period in the persistence store used by LockProvider.
    • This is currently supported for the conductor lock modes:
      1. local_only
      2. redis
    • For any other type of data store, one can implement PermitDAO.
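
The permit mechanism can be sketched as an expiring entry per job. The code below is an illustration under assumptions: it keys permits by a plain job name rather than ScheduledTaskDef, the SketchPermitDao methods are invented for this example, and the in-memory map only models a single node (comparable to local_only). A shared store would be needed for several servers.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

// Hypothetical stand-in for PermitDAO: one expiring permit per job.
interface SketchPermitDao {
    // Stores the permit unless an unexpired one already exists.
    boolean insertIfAbsent(String jobName, long expiresAtMillis, long nowMillis);
    void delete(String jobName);
}

final class InMemorySketchPermitDao implements SketchPermitDao {
    private final Map<String, Long> expiryByJob = new ConcurrentHashMap<>();

    @Override
    public boolean insertIfAbsent(String jobName, long expiresAtMillis, long nowMillis) {
        boolean[] granted = {false};
        // compute() runs atomically per key, so two callers cannot both win.
        expiryByJob.compute(jobName, (key, currentExpiry) -> {
            if (currentExpiry == null || currentExpiry <= nowMillis) {
                granted[0] = true;
                return expiresAtMillis;
            }
            return currentExpiry;
        });
        return granted[0];
    }

    @Override
    public void delete(String jobName) {
        expiryByJob.remove(jobName);
    }
}

final class SketchExecutionPermitter {
    private final SketchPermitDao dao;
    private final LongSupplier clock;   // current time in milliseconds
    private final long ttlMillis;       // how long a permit stays valid

    SketchExecutionPermitter(SketchPermitDao dao, LongSupplier clock, long ttlMillis) {
        this.dao = dao;
        this.clock = clock;
        this.ttlMillis = ttlMillis;
    }

    // Returns true only for the first caller within the validity period.
    boolean issue(String jobName) {
        long now = clock.getAsLong();
        return dao.insertIfAbsent(jobName, now + ttlMillis, now);
    }

    // Releases the permit early, once the job run has finished.
    void giveBack(String jobName) {
        dao.delete(jobName);
    }
}
```

In production use the clock would be System::currentTimeMillis. The expiry acts as a safety net: if a node crashes before calling giveBack, the permit frees itself after the fixed period.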

Persistence Layer

  • The persistence layer is designed to be completely aligned with the persistence architecture of conductor.
  • It is automatically enabled with the conductor property db=MYSQL (up to version 1.2.2) or conductor.db.type=mysql (version 2.0.0+).
  • Currently it has the following DAO interfaces:
    ScheduledWfMetadataDAO
    • This is used to persist the scheduling metadata definitions of workflows through the swagger operations under Scheduled Workflow Metadata Management.
    • This is currently implemented for MYSQL only. One can implement ScheduledWfMetadataDAO for any other type of persistence store.
    ScheduledWfExecutionDAO
    • This is used to persist scheduling reference against scheduled workflow.
    • Currently implemented to operate in memory and we do not see any need to move to any other persistence layer.
    IndexScheduledWfDAO
    • This is used to index each run of a scheduled job with many additional details like:
      • status of last execution
      • last execution time
      • next execution time
      • node on which execution has happened, etc.
    • This is currently implemented for MYSQL only. One can implement IndexScheduledWfDAO for an Elasticsearch persistence store.

Get Support

  • This project is maintained by @jas34 and @sudshan as an open source application. Use github issue tracking for filing issues, ideas or support requests.
  • We have a broad roadmap to add many features to this service through the various customizable hooks described above.
  • If a customization is an immediate need for you, feel free to open an issue on github. We will consider it a priority request.

Contributions

  • Whether it is a small documentation correction, bug fix or new features, contributions are highly appreciated.
  • Discuss new features and explore ideas with us through github issues.

License

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.