envoy-perf icon indicating copy to clipboard operation
envoy-perf copied to clipboard

[Salvo] Nighthawk is built every time we test an Envoy image

Open gyohuangxin opened this issue 3 years ago • 11 comments

With scavenging mode and binary mode, we build Nighthawk every time we test an Envoy image, which wasted much time for rebuilding if we have many images to be tested. https://github.com/envoyproxy/envoy-perf/blob/main/salvo/src/lib/benchmark/scavenging_benchmark.py#L75

gyohuangxin avatar Mar 02 '22 06:03 gyohuangxin

/assign @gyohuangxin

gyohuangxin avatar Mar 02 '22 06:03 gyohuangxin

gyohuangxin is not allowed to assign users.

:cat:

Caused by: a https://github.com/envoyproxy/envoy-perf/issues/129#issuecomment-1056372568 was created by @gyohuangxin.

see: more, trace.

@mum4k I'm looking at this issue, can you assign it to me? Thanks!

gyohuangxin avatar Mar 02 '22 06:03 gyohuangxin

Thank you @gyohuangxin, assigned.

mum4k avatar Mar 03 '22 04:03 mum4k

Hi @gyohuangxin , I think salvo clones and builds binary under different dirs every time make it unable to reuse the code and cache. I have a idea that we can generate dir name base on hash of salvo job configuration file to make sure same job manages to clone and build at same dir. What's your opinion on this?

xu1zhou avatar Jun 16 '22 02:06 xu1zhou

@xu1zhou Yes, you're correct. Nighthawk binary files will be built multiple times in different dirs. I think it's a good idea but it may be a huge change (just my personal guess, haven't verified it yet). Here is another idea which may change less files, we can name the nighthawk build dir based on the salvo job, then check if the nighthawk build dir exists before building Nighthawk.

I haven't much cycle on this issue now, please feel free to take this issue and raise PR. These two methods are both feasible, you can chose either.

gyohuangxin avatar Jun 16 '22 02:06 gyohuangxin

@gyohuangxin thanks for your suggestion!

xu1zhou avatar Jun 19 '22 00:06 xu1zhou

Some points I found after going through the code:

  • nighthawk calls file_ops.get_random_dir(home_dir) (https://github.com/envoyproxy/envoy-perf/blob/2d45a91915b8076da227371be4d6d1ee706a8670/salvo/src/lib/source_tree.py#L62-L65) to create ramdom build directory within /tmp/salvo. it's easy to create a fixed directory like /tmp/salvo/nighthawk to contain the source. But in current logic, _build_dir need to be a tempdir object with name attribute to be invoked in __repr__(self)(https://github.com/envoyproxy/envoy-perf/blob/2d45a91915b8076da227371be4d6d1ee706a8670/salvo/src/lib/source_tree.py#L69-L76) Problem is, function like os.mkdir() unenable to return an object like that.
  • Another workaround is to use an existing path contains nighthawk source code to replace an url. This would spare all clone operation from network, but for each bulid job, it works as before.

xu1zhou avatar Jun 20 '22 13:06 xu1zhou

Thanks for looking into this @xu1zhou. It is true that the current code expects a tempdir object in multiple locations, but that doesn't have to stay that way.

We could define our own object that would selectively wrap tempdir or a selected directory under the same API and pass this object around instead of tempdir. If you choose to do this, please send the definition of our own object and the following refactor from tempdir in separate PRs, so their scope remains fairly small. This will allow us to discuss before we over-invest in an implementation.

mum4k avatar Jun 20 '22 16:06 mum4k

@xu1zhou Thanks for report your findings. I can understand your confusion, envoy-perf/salvo/src/lib/source_tree.py is used for both Nighthawk and Envoy building process, the object changes will also have affects on Envoy.

Looked at TemporaryDirectory function, can we use suffix or prefix arguments to identify whether it is the duplicate nighthawk building? If it doesn't work, as @mum4k mentioned, you can create own objects.

gyohuangxin avatar Jun 21 '22 01:06 gyohuangxin

@gyohuangxin tempobject with suffix/prefix arguments only joins specified suffix/prefix with random string as name implies. These objects are different with whole path like /tmp/salvo/prefix-8avnbwxt-suffix

@xu1zhou Thanks for report your findings. I can understand your confusion, envoy-perf/salvo/src/lib/source_tree.py is used for both Nighthawk and Envoy building process, the object changes will also have affects on Envoy.

Looked at TemporaryDirectory function, can we use suffix or prefix arguments to identify whether it is the duplicate nighthawk building? If it doesn't work, as @mum4k mentioned, you can create own objects.

xu1zhou avatar Jun 21 '22 02:06 xu1zhou

This issue has been automatically marked as stale because it has not had activity in the last 90 days. It will be closed in the next 14 days unless it is tagged "help wanted" or "no stalebot" or other activity occurs. Thank you for your contributions.

github-actions[bot] avatar Nov 19 '22 08:11 github-actions[bot]

This issue has been automatically closed because it has not had activity in the last 104 days. If this issue is still valid, please ping a maintainer and ask them to label it as "help wanted" or "no stalebot". Thank you for your contributions.

github-actions[bot] avatar Dec 04 '22 08:12 github-actions[bot]