[HUDI-4687] Avoid setAccessible which breaks strong encapsulation

Open codope opened this issue 3 years ago • 1 comments

JDK versions 16 and later enforce strong encapsulation and do not allow invoking setAccessible on a field, especially when isAccessible is false. More details are in JEP 403. This PR attempts to address that for ObjectSizeCalculator. An immediate use case where this would be beneficial is integrating Hudi with other libraries, e.g. Trino, that run on newer JDK versions.
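To make the failure mode concrete, here is a minimal sketch of probing whether a field can be opened reflectively. The class name `FieldAccessProbe`, the helper `canAccess`, and the `Sample` class are hypothetical and are not taken from this PR; only `Field.setAccessible` and the exception behaviour are standard JDK.

```java
import java.lang.reflect.Field;

final class FieldAccessProbe {

    /** Illustrative class whose private field lives in application code. */
    static final class Sample {
        private int value = 42;
    }

    /** Returns true if the named field can be opened via reflection, false if the JVM refuses. */
    static boolean canAccess(Class<?> type, String fieldName) {
        final Field field;
        try {
            field = type.getDeclaredField(fieldName);
        } catch (NoSuchFieldException e) {
            throw new IllegalArgumentException("No field " + fieldName + " on " + type, e);
        }
        try {
            field.setAccessible(true);
            return true;
        } catch (RuntimeException e) {
            // On JDK 9+ the refusal is java.lang.reflect.InaccessibleObjectException,
            // a RuntimeException; catching the supertype keeps this compilable on JDK 8.
            return false;
        }
    }

    public static void main(String[] args) {
        // A field in our own (unnamed-module) code.
        System.out.println("Sample.value: " + canAccess(Sample.class, "value"));
        // A private field inside java.base; the result depends on JDK version and flags.
        System.out.println("String.value: " + canAccess(String.class, "value"));
    }
}
```

On a recent JDK without `--add-opens java.base/java.lang=ALL-UNNAMED`, the second probe would typically be refused, while older JDKs may still allow it.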

We evaluated two other approaches:

  1. Get the object size based on the number of bytes it serializes to. However, this fails if the incoming object does not implement Serializable.
  2. Use Java's Instrumentation API. For this, we need to create an instrumentation agent that can be hooked into the JVM. If Hudi were a standalone project, we could have taken this approach, but since Hudi is integrated into other projects as well, we would need to invest some time to figure out how to hook our instrumentation agent into those running JVMs.
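The limitation of the first approach can be illustrated with a short sketch. This is not code from the PR; `SerializedSizeEstimator` and `serializedSize` are hypothetical names, and the sketch returns an empty result instead of failing so the non-Serializable case is easy to see.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.OptionalLong;

final class SerializedSizeEstimator {

    /**
     * Returns the number of bytes produced by Java serialization,
     * or an empty result if the object graph is not serializable.
     */
    static OptionalLong serializedSize(Object obj) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        } catch (NotSerializableException e) {
            // The object (or something it references) does not implement Serializable.
            return OptionalLong.empty();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The stream has been closed (and therefore flushed) at this point.
        return OptionalLong.of(bytes.size());
    }

    public static void main(String[] args) {
        System.out.println("String:       " + serializedSize("hello"));
        System.out.println("plain Object: " + serializedSize(new Object()));
    }
}
```

Even when it succeeds, the serialized byte count is only a proxy: it includes stream and class-descriptor overhead and ignores object headers and padding, so it does not equal the in-heap footprint.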

Change Logs

  • If fields are accessible, run the usual code.
  • Else, use Java Object Layout (JOL) to get object size.
  • Add test with different types of objects.
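The dispatch described in the first two bullets might look roughly like the sketch below. It is an assumption about the shape of the change, not the PR's actual code: `FallbackSizeEstimator`, `allFieldsAccessible`, and `Point` are hypothetical, both estimators are passed in as functions so the example stays self-contained, and for brevity the check covers only the root object's class hierarchy rather than the whole object graph.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.function.ToLongFunction;

final class FallbackSizeEstimator {

    /** Illustrative application class with private state. */
    static final class Point {
        private final int x;
        private final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    private final ToLongFunction<Object> reflective;
    private final ToLongFunction<Object> fallback;

    /**
     * @param reflective the existing reflection-based estimator
     * @param fallback   used when reflection is refused; with JOL on the classpath this
     *                   could be obj -> GraphLayout.parseInstance(obj).totalSize()
     *                   (org.openjdk.jol.info.GraphLayout), which is an assumption here
     */
    FallbackSizeEstimator(ToLongFunction<Object> reflective, ToLongFunction<Object> fallback) {
        this.reflective = reflective;
        this.fallback = fallback;
    }

    long sizeOf(Object obj) {
        return allFieldsAccessible(obj.getClass())
            ? reflective.applyAsLong(obj)
            : fallback.applyAsLong(obj);
    }

    /** Tries to open every instance field declared in the class hierarchy. */
    static boolean allFieldsAccessible(Class<?> type) {
        for (Class<?> c = type; c != null; c = c.getSuperclass()) {
            for (Field field : c.getDeclaredFields()) {
                if (Modifier.isStatic(field.getModifiers())) {
                    continue; // static fields are not part of the instance footprint
                }
                try {
                    field.setAccessible(true);
                } catch (RuntimeException e) {
                    return false; // strong encapsulation (or a security manager) refused
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Stub estimators that only report which path was taken.
        FallbackSizeEstimator estimator = new FallbackSizeEstimator(o -> 1L, o -> 2L);
        System.out.println("Point  -> path " + estimator.sizeOf(new Point(1, 2)));
        System.out.println("String -> path " + estimator.sizeOf("hello"));
    }
}
```

Which path a JDK class such as String takes depends on the JDK version and any `--add-opens` flags in effect.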

Impact

High. This PR makes changes in ObjectSizeEstimator, which is on the hot path.

Risk level: none | low | medium | high

High. Tests have been added to cover various types of objects, both serializable and non-serializable.

Contributor's checklist

  • [ ] Read through contributor's guide
  • [ ] Change Logs and Impact were stated clearly
  • [ ] Adequate tests were added if applicable
  • [ ] CI passed

codope, Sep 12 '22 15:09

cancelling all azure CI runs for now to investigate CI flakiness. will retrigger build once we are in stable state. sorry about the inconvenience.

nsivabalan, Sep 23 '22 16:09

I'm a little worried about JOL's performance though, so it would be great if we can write a simple JMH-based micro-benchmark for it and compare it against what we had before (we can do it after the release branch cut).

Good point! I am going to take that up soon. HUDI-4943.

codope, Sep 28 '22 08:09

CI report:

  • f4d7ad25b0513ac14d78d08b58c7bd0b4b0cf374 UNKNOWN
  • 1751ff401d075667a7acedde954a9b9c459ce7ef Azure: SUCCESS

Bot commands

@hudi-bot supports the following commands:

  • @hudi-bot run azure: re-run the last Azure build

hudi-bot, Sep 28 '22 16:09