[FLINK-37722] Eliminate redundant field initialization of PojoSerializer

Open X-czh opened this issue 7 months ago • 3 comments

What is the purpose of the change

Perf: eliminate redundant field initialization of PojoSerializer.

Brief change log

Currently, PojoSerializer first creates a new POJO instance and initializes all of its fields (in the createInstance() method), then deserializes the record and sets each field. The field initialization in createInstance() is redundant on this path, because every field is overwritten during deserialization anyway. We should eliminate it for better performance.

The TupleSerializer already applies a similar technique; see the usage of TupleSerializer#instantiateRaw.
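
For illustration only, here is a minimal, self-contained sketch of the idea. It is not the actual Flink code: the class PojoInitSketch, the Event POJO, and the initializeFieldsCalls counter are hypothetical names, and the stream-based serialization merely stands in for the real per-field serializers. It shows a deserialization path that creates a bare instance instead of going through the default-initializing createInstance().

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Toy model of the change; all names here are hypothetical, not Flink APIs. */
final class PojoInitSketch {

    /** Hypothetical POJO standing in for a user-defined type. */
    static final class Event {
        long id;
        String name;
    }

    /** Counts how often the default-initialization path runs. */
    static int initializeFieldsCalls = 0;

    /** Mirrors the old path: instantiate, then give every field a default value. */
    static Event createInstance() {
        Event e = new Event();
        initializeFieldsCalls++;
        e.id = 0L;
        e.name = "";
        return e;
    }

    /** Mirrors the proposed path: instantiate only, leave fields untouched. */
    static Event instantiateRaw() {
        return new Event();
    }

    static byte[] serialize(Event e) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeLong(e.id);
            out.writeUTF(e.name);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    /** Every field is assigned below, so default initialization would be wasted work. */
    static Event deserialize(byte[] data) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
            Event e = instantiateRaw(); // instead of createInstance()
            e.id = in.readLong();
            e.name = in.readUTF();
            return e;
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    public static void main(String[] args) {
        Event original = createInstance();
        original.id = 42L;
        original.name = "click";
        Event copy = deserialize(serialize(original));
        System.out.println(copy.id + " " + copy.name);
    }
}
```

In this sketch, createInstance() stays available for callers that need a fully initialized object; only the deserialization path switches to the raw instantiation.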

Verifying this change

This change is already covered by existing tests.

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): (yes / no)
  • The public API, i.e., is any changed class annotated with @Public(Evolving): (yes / no)
  • The serializers: (yes / no / don't know)
  • The runtime per-record code paths (performance sensitive): (yes / no / don't know)
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (yes / no / don't know)
  • The S3 file system connector: (yes / no / don't know)

X-czh avatar Apr 27 '25 06:04 X-czh

CI report:

  • 0bef61571bcc4c02ab8c24a18aeab81d8fd6487f Azure: SUCCESS
Bot commands

The @flinkbot bot supports the following commands:
  • @flinkbot run azure re-run the last Azure build

flinkbot avatar Apr 27 '25 07:04 flinkbot

@flinkbot run azure

X-czh avatar Apr 27 '25 11:04 X-czh

@JunRuiLee Could you help review this?

X-czh avatar Apr 28 '25 02:04 X-czh