ozone icon indicating copy to clipboard operation
ozone copied to clipboard

HDDS-10811. Reduce UTF8 string encoding by caching encoding result

Open xichen01 opened this issue 1 year ago • 6 comments

What changes were proposed in this pull request?

Reduce com/google/protobuf/ByteString.copyFromUtf8 when encode a protobuf Object.

  • For a String type protobuf field, if you use the setXXX method (such as builder.setHostName), the protobuf will call ByteString.copyFromUtf8 convert UTF8 String to ByteString.
  • For some UTF8 String with fixed content, we can cache its ByteString value, and use setXXXBytes to avoid calling ByteString.copyFromUtf8

Before https://issues.apache.org/jira/secure/attachment/13068731/om-create-key-alloc-profiler-setxxxString.html

After The com/google/protobuf/ByteString.copyFromUtf8 in the DatanodeDetail disappear. https://issues.apache.org/jira/secure/attachment/13068732/om-create-key-alloc-profiler-setxxbytes.html

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-10811

How was this patch tested?

Existing Test.

xichen01 avatar May 08 '24 11:05 xichen01

@xichen01 , thanks a lot for working on this! How about add a StringWithByteString class as below? Than, it won't have any additional cache lookups.

diff --git a/hadoop-hdds/common/src/main/java/org/apache/hadoop/hdds/HddsUtils.java b/hadoop-hdds/common/src/main/java/org/apache/hadoop/hdds/HddsUtils.java
index 26f6be44e5..d62af61d2a 100644

@szetszwo Thanks for the suggestion, but I noticed that in OM, there are quite a few instances of DatanodeDetail, and this implementation adds more memory consumption to the member variable than before.

The current my PR idea is not to add extra memory consumption, so it uses a global static Map to cache the ByteString.

xichen01 avatar May 09 '24 12:05 xichen01

@xichen01 , do you have comparison before and after this change? What is the memory saving? Is it in MBs or GBs?

If there are many DatanodeDetails instances , we probably should cache DatanodeDetails instances instead of the ByteString inside the DatanodeDetails.

szetszwo avatar May 09 '24 14:05 szetszwo

@szetszwo Thanks for your reply

If there are many DatanodeDetails instances , we probably should cache DatanodeDetails instances instead of the ByteString inside the DatanodeDetails.

  • But DatanodeDetails is not an immutable object, so it is not possible to simply cache DatanodeDetails, which would require considerable modification, and possibly have some compatibility issues.

  • Moreover, DatanodeDetails not only consumes memory space for its own objects, but also consumes a lot of resources during the encode/decode of DatanodeDetails, this PR is to reduce the consumption of DatanodeDetails encode.

@xichen01 , do you have comparison before and after this change? What is the memory saving? Is it in MBs or GBs?

Here's an alloc flame graph that shows the effect of the optimization

Before https://issues.apache.org/jira/secure/attachment/13068731/om-create-key-alloc-profiler-setxxxString.html After The com/google/protobuf/ByteString.copyFromUtf8 in the DatanodeDetail disappear. https://issues.apache.org/jira/secure/attachment/13068732/om-create-key-alloc-profiler-setxxbytes.html

Of course, this PR is only a small improvement, and it would be a big improvement to be able to cache the DatanodeDetails, which may require more work.

xichen01 avatar May 09 '24 15:05 xichen01

... DatanodeDetails is not an immutable object, ...

SCM and other code probably don't needs different DatanodeDetails instances for the same datanode. I agree that it is not easy to fix. So, our change, although not fixing it, should not make it harder to fix it in the future.

... but also consumes a lot of resources during the encode/decode of DatanodeDetails, this PR is to reduce the consumption of DatanodeDetails encode.

This PR also adds Guava Cache with some hard coded size (50,000). Guava Cache is a complicated data structure. It may run threads internally. It needs cache lookup. It will create more problems. It does not worth to save the memory space for 50,000 short ByteString.

That's why I suggest to add StringWithByteString. Please consider it.

szetszwo avatar May 09 '24 16:05 szetszwo

This PR also adds Guava Cache with some hard coded size (50,000). ....

In fact, the reason I use Guava Cache is to prevent accidents from occurring that would result in too many keys in the cache. Guava Cache makes it easy to control how much to cache.

If we can guarantee the number of keys to be cached ( hostname, ip, uuid, ... (should not be too much.)), I think we can use a simpler Map Cache structure such as ConcurrentHashMap

That's why I suggest to add StringWithByteString. Please consider it.

StringWithByteString still needs to repeatedly calculate ByteString.copyFromUtf8, which moves the call of ByteString.copyFromUtf8 from inside protobuf to StringWithByteString (Some DatanodeDetails have short lifecycles, may only be accessed once) But for DatanodeDetail in Pipeline Cache, it will be effective solution

How do you think?

xichen01 avatar May 09 '24 17:05 xichen01

StringWithByteString still needs to repeatedly calculate ByteString.copyFromUtf8, which moves the call of ByteString.copyFromUtf8 from inside protobuf to StringWithByteString (Some DatanodeDetails have short lifecycles, may only be accessed once)

Only the DatanodeDetails created from the Builder needs converting. The DatanodeDetails created from another DatanodeDetails does not (updated the code in my previous comment). The short lived DatanodeDetails probably are not created by the Builder. No?

BTW, we could also change Builder to pass StringWithByteString when converting proto-to-DatanodeDetails. Then, it will already have the ByteString.

szetszwo avatar May 09 '24 17:05 szetszwo

@szetszwo Thanks for your new comments and patch, I have applied the patch.

xichen01 avatar May 12 '24 12:05 xichen01