flink icon indicating copy to clipboard operation
flink copied to clipboard

[FLINK-33611] [flink-protobuf] Support Large Protobuf Schemas

Open sharath1709 opened this issue 2 years ago • 9 comments

What is the purpose of the change

Background

Flink serializes and deserializes protobuf format data by calling the decode or encode method in GeneratedProtoToRow_XXX.java generated by codegen to parse byte[] data into Protobuf Java objects. FLINK-32650 has introduced the ability to split the generated code to improve the performance for large Protobuf schemas. However, this is still not sufficient to support some larger protobuf schemas as the generated code exceeds the java constant pool size limit and we can see errors like "Too many constants" when trying to compile the generated code. We have run into these issues while doing a POC with some of our real production schemas at UBER

Solution

Split the last code segment also only when the size exceeds split threshold limit. Currently, the last segment of the generated code is always being split which can lead to too many split methods and thus exceed the constant pool size limit

Implementation

Add check for split size before splitting last segment

Brief change log

  • [FLINK-33611] [flink-protobuf] Add VeryBigProtobuf testcases
    
  • [FLINK-33611] [flink-protobuf] Split last segment only when size exceeds split threshold limit in serializer
    
  • [FLINK-33611] [flink-protobuf] Split last segment only when size exceeds split threshold limit in deserializer
    

Verifying this change

This change is already covered by existing tests, such as existing Unit Tests and Integration tests for Protobuf format Including the BigProtoBufCodeSpiltterTest. Additionally, I have added a new test with very big protobuf schemas which would fail without the optimizations made in this pull request

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): No
  • The public API, i.e., is any changed class annotated with @Public(Evolving): No
  • The serializers: No
  • The runtime per-record code paths (performance sensitive): No
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: No
  • The S3 file system connector: No

Documentation

  • Does this pull request introduce a new feature? Yes
  • If yes, how is the feature documented? Not Applicable as the feature is fully transparent to users

sharath1709 avatar Dec 15 '23 22:12 sharath1709

CI report:

  • cc57c9c6c55262cd4d6862cce5a1cb76151aba75 Azure: FAILURE
Bot commands The @flinkbot bot supports the following commands:
  • @flinkbot run azure re-run the last Azure build

flinkbot avatar Dec 15 '23 22:12 flinkbot

@libenchao Can you please review this PR? I see you're one of the main reviewer for this module in the past

sharath1709 avatar Dec 20 '23 23:12 sharath1709

@sharath1709 Thanks for your contribution. Per the community's convention, we need to reach an agreement on the issue before we go to the PR review. I'll comment on the corresponding Jira, and let's get back here after we reach the agreement.

libenchao avatar Dec 21 '23 12:12 libenchao

@libenchao Thanks for the quick response. Sure, let's try to reach an agreement on the JIRA first

sharath1709 avatar Dec 21 '23 19:12 sharath1709

@sharath1709 sorry for the delay, I haven't checked the code/design yet, but can you add a test case that fails without this improvement?

libenchao avatar Jan 02 '24 11:01 libenchao

@flinkbot run azure

sharath1709 avatar Jan 08 '24 22:01 sharath1709

@flinkbot run azure

sharath1709 avatar Jan 09 '24 00:01 sharath1709

@flinkbot run azure

sharath1709 avatar Jan 09 '24 21:01 sharath1709

@libenchao Could you review this PR again, it is updated as per our discussion on the JIRA ticket and I also added a test case that fails without the changes in this PR. Also, the failing python test is unrelated to this diff and seems to be flaky.

sharath1709 avatar Jan 10 '24 21:01 sharath1709

@libenchao Gentle ping on the review

sharath1709 avatar Jan 29 '24 20:01 sharath1709