[Feature][API] Add metadata changed event in api module when metadata changed

Open ruanwenjun opened this issue 8 months ago • 12 comments

Search before asking

  • [x] I had searched in the issues and found no similar feature requirement.

Description

We need to do some work whenever metadata changes in DS, e.g. lineage parsing when a workflow or task changes, workflow online notifications, project delete notifications, and so on.

It would be better to expose these events in the DS API module.

We can use an EventBus to store these events. The default implementation can be in-memory and simply print the events to the log, and users can easily add their own event consumers.
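The proposal above could look roughly like the following. This is only a minimal sketch, not DolphinScheduler code: `MetadataChangedEvent`, `InMemoryMetadataEventBus`, and their fields are hypothetical names, and consumers are called synchronously to keep the example short.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

// Hypothetical event payload; a real design would likely carry richer objects.
final class MetadataChangedEvent {
    final String resourceType; // e.g. "WORKFLOW", "PROJECT"
    final String action;       // e.g. "UPDATE", "DELETE"
    final long resourceCode;

    MetadataChangedEvent(String resourceType, String action, long resourceCode) {
        this.resourceType = resourceType;
        this.action = action;
        this.resourceCode = resourceCode;
    }

    @Override
    public String toString() {
        return resourceType + "_" + action + "(" + resourceCode + ")";
    }
}

// Hypothetical in-memory bus: consumers run synchronously in registration order.
final class InMemoryMetadataEventBus {
    private final List<Consumer<MetadataChangedEvent>> consumers = new CopyOnWriteArrayList<>();

    InMemoryMetadataEventBus() {
        // Default consumer: only print the event to the log.
        consumers.add(event -> System.out.println("Metadata changed: " + event));
    }

    // Users plug in their own consumer, e.g. a lineage parser or a notifier.
    void register(Consumer<MetadataChangedEvent> consumer) {
        consumers.add(consumer);
    }

    void publish(MetadataChangedEvent event) {
        for (Consumer<MetadataChangedEvent> consumer : consumers) {
            consumer.accept(event);
        }
    }
}
```

A user-side consumer would then be registered with `bus.register(event -> ...)` before the API module starts publishing.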

Use case

No response

Related issues

No response

Are you willing to submit a PR?

  • [x] Yes I am willing to submit a PR!

Code of Conduct

ruanwenjun avatar Apr 30 '25 03:04 ruanwenjun

Hey, I'd like to take on this task. Has it already been started?

yingh0ng avatar May 27 '25 02:05 yingh0ng

Not started yet. It would be better to provide a design first.

ruanwenjun avatar May 27 '25 14:05 ruanwenjun

Alright, I'll share my design later.

yingh0ng avatar May 28 '25 01:05 yingh0ng

@ruanwenjun Hello, I have a design question. Why do ITaskExecutor and IWorkflowExecutionRunnable each require their own EventBus?

In systems I've observed, a component typically uses a single event bus. This approach is simpler to manage and can leverage message sharding for scalability when performance needs arise.

Could you advise on which design is more appropriate in this case?

yingh0ng avatar Jun 04 '25 07:06 yingh0ng

Because TaskExecutor is a component on the worker, while IWorkflowExecutionRunnable and ITaskExecutionRunnable are components on the master.

ruanwenjun avatar Jun 05 '25 01:06 ruanwenjun

I am sorry, I meant: why does every WorkflowExecuteContext have its own WorkflowEventBus instance instead of sharing a single WorkflowEventBus instance?

yingh0ng avatar Jun 05 '25 02:06 yingh0ng

Giving each job/workflow its own event bus is a common design in this kind of system because it makes event management easy; in YARN, each RMApp also has its own event queue. If all workflows shared one EventBus, we would still need extra work to dispatch the events to the different consumers.
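To illustrate the trade-off described above, here is a small sketch contrasting the two layouts. None of these classes exist in DolphinScheduler; `SketchWorkflowContext`, `SketchSharedEventBus`, and `SketchWorkflowEvent` are made-up names, and plain queues stand in for the real event bus.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

// Per-workflow layout: the context owns its queue, so every polled event
// already belongs to this workflow instance and needs no routing.
final class SketchWorkflowContext {
    final int workflowInstanceId;
    private final Queue<String> eventBus = new ArrayDeque<>();

    SketchWorkflowContext(int workflowInstanceId) {
        this.workflowInstanceId = workflowInstanceId;
    }

    void publish(String eventType) { eventBus.add(eventType); }
    String poll() { return eventBus.poll(); }
    int pending() { return eventBus.size(); }
}

// Event for the shared layout: it must carry the owner id for later routing.
final class SketchWorkflowEvent {
    final int workflowInstanceId;
    final String type;

    SketchWorkflowEvent(int workflowInstanceId, String type) {
        this.workflowInstanceId = workflowInstanceId;
        this.type = type;
    }
}

// Shared layout: one queue for all workflows, plus an extra dispatch step.
final class SketchSharedEventBus {
    private final Queue<SketchWorkflowEvent> queue = new ArrayDeque<>();

    void publish(SketchWorkflowEvent event) { queue.add(event); }

    // The "extra work": drain the queue and group events by owning workflow.
    Map<Integer, List<String>> drainByWorkflow() {
        Map<Integer, List<String>> grouped = new HashMap<>();
        SketchWorkflowEvent event;
        while ((event = queue.poll()) != null) {
            grouped.computeIfAbsent(event.workflowInstanceId, k -> new ArrayList<>()).add(event.type);
        }
        return grouped;
    }
}
```

With the per-workflow layout, discarding a finished workflow's pending events is just a matter of dropping its context.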

ruanwenjun avatar Jun 06 '25 01:06 ruanwenjun

I did some reading on this. An independent event bus, as you said, is the better design in DS. Thanks for your suggestion.

yingh0ng avatar Jun 09 '25 09:06 yingh0ng

@ruanwenjun Hey! I have made a basic design using multiple EventBus instances. However, I found that metadata events are very short-lived: they have no state changes, and there is no owning instance to hold the EventBus the way WorkflowInstance and TaskInstance do.

If I use multiple EventBus instances to store the events, I have to create an EventBus instance for every event, and each instance would accept only one event, such as a project delete event, and then be closed. Should I use this design?

Could you give me some suggestions? Thanks!

yingh0ng avatar Jun 12 '25 03:06 yingh0ng

Metadata events belong to a workflow, project, or datasource, so you only need to add one event bus in the API module. Whenever a workflow goes online/offline or is updated, a project is created/deleted, or a datasource is created/updated/deleted, you can produce the corresponding event.
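On the producer side, this suggestion could be sketched as below. It assumes a single shared bus injected into each service; `SketchProjectService` and the string-encoded events are hypothetical simplifications, not the actual API-module classes.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

// Hypothetical service in the API module. It shares one metadata event bus
// (modelled here as a plain Queue<String>) with the other services.
final class SketchProjectService {
    private final Map<Long, String> projects = new HashMap<>();
    private final Queue<String> metadataEventBus;

    SketchProjectService(Queue<String> metadataEventBus) {
        this.metadataEventBus = metadataEventBus;
    }

    void createProject(long code, String name) {
        projects.put(code, name);
        // Produce the event only after the metadata change has been applied.
        metadataEventBus.add("PROJECT_CREATE:" + code);
    }

    boolean deleteProject(long code) {
        if (projects.remove(code) == null) {
            return false; // nothing changed, so no event is produced
        }
        metadataEventBus.add("PROJECT_DELETE:" + code);
        return true;
    }
}
```

A workflow or datasource service would follow the same pattern and publish to the same bus instance.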

ruanwenjun avatar Jun 13 '25 03:06 ruanwenjun

Got it, thanks!

yingh0ng avatar Jun 16 '25 01:06 yingh0ng

@ruanwenjun This is my design, PTAL.

Design Detail

Image

AbstractMetadataEvent

Extends IEvent. Responsible for defining the metadata event.

The abstract method is:

public abstract IMetadataEventType getEventType();

Subclasses:

  • AbstractWorkflowMetadataEvent. Holds the workflowDefinition. Subclasses:
    • WorkflowDefinitionUpdateEvent
    • WorkflowDefinitionReleaseEvent
    • WorkflowDefinitionOfflineEvent
  • AbstractDatasourceMetadataEvent. Holds the Datasource. Subclasses:
    • DatasourceCreateEvent
    • DatasourceUpdateEvent
    • DatasourceDeleteEvent
  • AbstractProjectMetadataEvent. Holds the Project. Subclasses:
    • ProjectCreateEvent
    • ProjectDeleteEvent

IMetadataEventType

The interface for defining the event type.

I will only add some common event types this time, and their handlers will only print a log message.

Implementations (enums):

  • WorkflowMetadataEventType
    • WORKFLOW_DEFINITION_UPDATE
    • WORKFLOW_DEFINITION_RELEASE
    • WORKFLOW_DEFINITION_OFFLINE
  • DatasourceMetadataEventType
    • DATASOURCE_CREATE
    • DATASOURCE_UPDATE
    • DATASOURCE_DELETE
  • ProjectMetadataEventType
    • PROJECT_CREATE
    • PROJECT_DELETE
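The event classes and event types listed above could fit together roughly as follows, shown for the project branch only. This is a sketch under assumptions: `IEvent` is treated as a marker interface and redeclared locally so the snippet compiles on its own, and `Project` is a stripped-down stand-in for the real entity.

```java
// Local stand-in for DolphinScheduler's IEvent (assumed to be a marker interface here).
interface IEvent {
}

// Common type for all metadata event type enums.
interface IMetadataEventType {
}

enum ProjectMetadataEventType implements IMetadataEventType {
    PROJECT_CREATE,
    PROJECT_DELETE
}

abstract class AbstractMetadataEvent implements IEvent {
    public abstract IMetadataEventType getEventType();
}

// Hypothetical minimal Project; the real entity has many more fields.
final class Project {
    final long code;
    final String name;

    Project(long code, String name) {
        this.code = code;
        this.name = name;
    }
}

// Holds the Project that the event is about.
abstract class AbstractProjectMetadataEvent extends AbstractMetadataEvent {
    private final Project project;

    protected AbstractProjectMetadataEvent(Project project) {
        this.project = project;
    }

    public Project getProject() {
        return project;
    }
}

final class ProjectCreateEvent extends AbstractProjectMetadataEvent {
    ProjectCreateEvent(Project project) {
        super(project);
    }

    @Override
    public IMetadataEventType getEventType() {
        return ProjectMetadataEventType.PROJECT_CREATE;
    }
}

final class ProjectDeleteEvent extends AbstractProjectMetadataEvent {
    ProjectDeleteEvent(Project project) {
        super(project);
    }

    @Override
    public IMetadataEventType getEventType() {
        return ProjectMetadataEventType.PROJECT_DELETE;
    }
}
```

The workflow and datasource branches would mirror this shape with their own enum and abstract base class.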

MetadataEventBus

Extends AbstractDelayEventBus. Responsible for accepting and storing AbstractMetadataEvent instances.

IMetadataEventHandler

Responsible for handling an AbstractMetadataEvent. The fire worker dispatches events to it by matching the event type. Users can add their own implementations to handle events.

The default implementations:

  • Workflow
    • WorkflowDefinitionUpdateHandler
    • WorkflowDefinitionReleaseHandler
    • WorkflowDefinitionOfflineHandler
  • Datasource
    • DatasourceCreateHandler
    • DatasourceUpdateHandler
    • DatasourceDeleteHandler
  • Project
    • ProjectCreateHandler
    • ProjectDeleteHandler

In the default implementation, all of the handlers above only print a log message.
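Dispatching by event type could be sketched like this. The method names `matchEventType` and `handle`, the `MetadataEventHandlerRegistry` class, and the flattened `SketchMetadataEvent` are assumptions made for illustration; the design above does not specify the handler's signature.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

interface IMetadataEventType {
}

enum DatasourceMetadataEventType implements IMetadataEventType {
    DATASOURCE_CREATE, DATASOURCE_UPDATE, DATASOURCE_DELETE
}

// Hypothetical flattened event: a type plus a short description.
final class SketchMetadataEvent {
    final IMetadataEventType type;
    final String detail;

    SketchMetadataEvent(IMetadataEventType type, String detail) {
        this.type = type;
        this.detail = detail;
    }
}

// Assumed handler contract: declare the type you match, then handle the event.
interface IMetadataEventHandler {
    IMetadataEventType matchEventType();

    void handle(SketchMetadataEvent event);
}

// Default implementation: only print a log line.
final class DatasourceDeleteHandler implements IMetadataEventHandler {
    @Override
    public IMetadataEventType matchEventType() {
        return DatasourceMetadataEventType.DATASOURCE_DELETE;
    }

    @Override
    public void handle(SketchMetadataEvent event) {
        System.out.println("Datasource deleted: " + event.detail);
    }
}

// Hypothetical registry that the fire worker would consult.
final class MetadataEventHandlerRegistry {
    private final Map<IMetadataEventType, List<IMetadataEventHandler>> handlers = new HashMap<>();

    void register(IMetadataEventHandler handler) {
        handlers.computeIfAbsent(handler.matchEventType(), k -> new ArrayList<>()).add(handler);
    }

    // Calls every handler registered for the event's type; returns how many ran.
    int dispatch(SketchMetadataEvent event) {
        List<IMetadataEventHandler> matched =
                handlers.getOrDefault(event.type, Collections.<IMetadataEventHandler>emptyList());
        for (IMetadataEventHandler handler : matched) {
            handler.handle(event);
        }
        return matched.size();
    }
}
```

A user-supplied handler registers alongside the default one for the same type, so both run for each matching event.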

MetadataEventBusFireWorker

A daemon thread responsible for firing events from the eventBus and dispatching them to handlers by matching the event type.
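A fire worker of this kind could be sketched as follows. The real AbstractDelayEventBus API is not shown in this thread, so the JDK's `DelayQueue` is used here as a stand-in for the bus; `SketchDelayedEvent`, `SketchFireWorker`, and `fireOnce` are hypothetical names.

```java
import java.util.concurrent.DelayQueue;
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical delayed event; stands in for AbstractMetadataEvent.
final class SketchDelayedEvent implements Delayed {
    final String name;
    private final long fireAtNanos;

    SketchDelayedEvent(String name, long delayMillis) {
        this.name = name;
        this.fireAtNanos = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(delayMillis);
    }

    @Override
    public long getDelay(TimeUnit unit) {
        return unit.convert(fireAtNanos - System.nanoTime(), TimeUnit.NANOSECONDS);
    }

    @Override
    public int compareTo(Delayed other) {
        return Long.compare(getDelay(TimeUnit.NANOSECONDS), other.getDelay(TimeUnit.NANOSECONDS));
    }
}

// Daemon thread that takes due events off the bus and hands them to a dispatcher.
final class SketchFireWorker extends Thread {
    private final DelayQueue<SketchDelayedEvent> eventBus;
    private final Consumer<SketchDelayedEvent> dispatcher;

    SketchFireWorker(DelayQueue<SketchDelayedEvent> eventBus, Consumer<SketchDelayedEvent> dispatcher) {
        super("metadata-event-fire-worker");
        setDaemon(true); // must not keep the JVM alive on shutdown
        this.eventBus = eventBus;
        this.dispatcher = dispatcher;
    }

    @Override
    public void run() {
        while (!Thread.currentThread().isInterrupted()) {
            try {
                dispatcher.accept(eventBus.take()); // blocks until an event is due
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // restore the flag and leave the loop
            } catch (RuntimeException e) {
                System.err.println("Metadata event handler failed: " + e); // keep the worker alive
            }
        }
    }

    // Non-blocking variant, handy for tests: fires at most one due event.
    boolean fireOnce() {
        SketchDelayedEvent event = eventBus.poll();
        if (event == null) {
            return false;
        }
        dispatcher.accept(event);
        return true;
    }
}
```

In this sketch the dispatcher passed to the worker would be the component that matches the event type against the registered handlers.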

yingh0ng avatar Jun 18 '25 01:06 yingh0ng