SAMZA-2168: Remove redundant SystemAdmin creation in ApplicationMaster

Open shanthoosh opened this issue 5 years ago • 5 comments

The Samza ApplicationMaster is the process responsible for scheduling, orchestrating, and managing the lifecycle of the containers of a Samza job.

The SystemAdmin abstraction is used in Samza to validate, create, and fetch the metadata of the input and metadata streams of a Samza job. Creating a SystemAdmin instance for a system is an expensive operation that entails establishing a connection with the broker and setting up resources.

Currently, the SystemAdmin for a system is created multiple times across the different components of the Samza ApplicationMaster. This duplicate creation happens as part of the ApplicationMaster startup sequence and unnecessarily increases its startup time.
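
For illustration only, the general idea of creating one admin per system and sharing it across components can be sketched as below. This is a minimal sketch, not the code in this patch: `Admin`, `AdminCache`, and `CountingAdminFactory` are hypothetical names, and `Admin` is a stand-in rather than Samza's actual `SystemAdmin` interface.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical stand-in for an expensive per-system admin client.
// This is NOT the real Samza SystemAdmin interface.
interface Admin {
    String systemName();
}

// Hypothetical cache: creates at most one Admin per system name and
// hands the same instance to every component that asks for it.
final class AdminCache {
    private final Map<String, Admin> admins = new ConcurrentHashMap<>();
    private final Function<String, Admin> factory;

    AdminCache(Function<String, Admin> factory) {
        this.factory = factory;
    }

    Admin get(String systemName) {
        // computeIfAbsent invokes the factory only when no Admin exists yet
        // for this system name.
        return admins.computeIfAbsent(systemName, factory);
    }
}

// Hypothetical factory that counts how many admins were actually created,
// standing in for the costly connection and resource setup.
final class CountingAdminFactory implements Function<String, Admin> {
    final AtomicInteger created = new AtomicInteger();

    @Override
    public Admin apply(String systemName) {
        created.incrementAndGet();
        return () -> systemName;
    }
}
```

With a shared cache like this, two components asking for the admin of the same system receive the same instance, so the expensive setup runs once per system instead of once per component.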

shanthoosh avatar Apr 18 '19 19:04 shanthoosh

@xinyuiscool

Please take a look when you have a chance.

shanthoosh avatar Apr 18 '19 19:04 shanthoosh

@xinyuiscool

  1. I tested this patch with the open-source hello-samza test jobs.
  2. I verified that this patch works with samza-yarn and a Beam job at LinkedIn.

For large stateful jobs that consume many input topics from different systems, I verified that this patch reduces the startup time of the ApplicationMaster from 1.5 minutes to 20 seconds.

shanthoosh avatar Apr 18 '19 19:04 shanthoosh

Is this PR still needed?

mynameborat avatar Jul 08 '19 20:07 mynameborat

This patch reduces the number of SystemAdmin instances created during the ApplicationMaster startup control flow, which in turn reduces the ApplicationMaster startup time.

Let me fix the merge conflicts and update the patch.

shanthoosh avatar Jul 08 '19 23:07 shanthoosh

@shanthoosh FYI, is this PR still relevant?

prateekm avatar Oct 08 '19 17:10 prateekm