bazel-buildfarm icon indicating copy to clipboard operation
bazel-buildfarm copied to clipboard

Use non-blocking UUID generation in CASFileCache

Open zpzjzj opened this issue 2 years ago • 6 comments

We deploy Buildfarm on k8s clusters in which case shard CAS may not provide persistent storage. Jobs on worker start with heavy fetch requests from unique remote storage, which got frequently blocked by UUID generation, this optimization can be short but effective.

I setup with single server and single worker on 96 core CPU node. ran with our monorepo with about 35k cpp files.

profile with arthas

profiler start -d 300
  1. default image

  2. with thread local image

ref: https://github.com/apache/openwhisk/issues/2747

zpzjzj avatar Jan 13 '23 02:01 zpzjzj

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

google-cla[bot] avatar Jan 13 '23 02:01 google-cla[bot]

Can you provide some context on the reason for this? What is the benefit to applying a non blocking uuid gen?

werkt avatar Jan 13 '23 02:01 werkt

Can you provide some context on the reason for this? What is the benefit to applying a non blocking uuid gen?

Do you really have enough write generation that this matters (i.e. dominates)?

werkt avatar Jan 13 '23 02:01 werkt

Good job profiling this. It's nice to see you got a performance boost as well. My only suggestion is to make the UUID fetching something like a Supplier. That way, we can use dependency injection and choose which UUID implementation we want.

I would be good to land this if we can preserve the existing behavior and make this optimization opt-in at some level in the code.

luxe avatar Jan 16 '23 06:01 luxe

Good job profiling this. It's nice to see you got a performance boost as well. My only suggestion is to make the UUID fetching something like a Supplier. That way, we can use dependency injection and choose which UUID implementation we want.

I would be good to land this if we can preserve the existing behavior and make this optimization opt-in at some level in the code.

Not totally sure we need a dependency injection - I'm fine switching our implementation to use this given the profiling, and I'd also like to see this applied to the uuid gen for operations

werkt avatar Jan 16 '23 15:01 werkt

@zpzjzj can you rebase and revalidate tests with this?

werkt avatar Feb 20 '23 05:02 werkt