azure-sdk-for-net

Add trailing glob to improve sparse checkout performance

Open benbp opened this issue 3 years ago • 6 comments

benbp avatar Sep 09 '22 20:09 benbp

Some of the sparse checkouts are still almost 2 mins, see https://dev.azure.com/azure-sdk/public/_build/results?buildId=1842107&view=logs&j=d3fdcdd1-7a8e-5668-5e2b-ac9753b27d6a&t=96ea04e2-37b6-5a66-a3dd-f25fc0d42970. Is that still expected?

weshaggard avatar Sep 09 '22 21:09 weshaggard

Perhaps the negative filter is causing an issue. Let me play around with it. In general, though, our sparse checkouts have been taking a lot longer than before, even with the optimization. I'm not sure yet whether this is a GitHub server-side perf issue or a repo growth issue.

benbp avatar Sep 09 '22 21:09 benbp

@benbp Didn't the recent git client change also potentially change the perf of the cone option? It might be worth experimenting with their new recommendations.

weshaggard avatar Sep 09 '22 21:09 weshaggard

@benbp Didn't the recent git client change also potentially change the perf of the cone option? It might be worth experimenting with their new recommendations.

It shouldn't have. The impetus for the change was to address the worst-case performance of non-cone mode, but the behavior of non-cone mode should still be the same from the client side, as far as I know.

With our current usage, cone mode is not an option because we don't know in advance which paths to include, e.g. when we want to check out all markdown files, and it doesn't support exclusions from what I can tell.
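To make the constraint concrete, here is a minimal sketch (not the pipeline's actual script) of the kind of non-cone pattern list described above: a directory include, a repo-wide file glob, and a negated exclusion. The function name, the example paths, the `SessionRecords` recordings folder, and the assumption that the "trailing glob" in the title means appending `/**` to a directory path are all illustrative and do not come from this thread.

```python
def build_sparse_patterns(include_dirs, include_globs=(), exclude=()):
    """Build gitignore-style patterns for a non-cone sparse checkout.

    Hypothetical helper for illustration only.
    """
    patterns = []
    for directory in include_dirs:
        # Anchor the directory at the repo root and add a trailing glob
        # (assumed form: "/<dir>/**") so everything under it matches.
        patterns.append("/" + directory.strip("/") + "/**")
    # File globs such as "all markdown files" cannot be listed as
    # directories up front, which is why cone mode does not fit here.
    patterns.extend(include_globs)
    # Negated patterns go last so they override the includes above.
    patterns.extend("!" + pattern for pattern in exclude)
    return patterns


patterns = build_sparse_patterns(
    include_dirs=["sdk/core"],          # example path, assumed
    include_globs=["**/*.md"],          # every markdown file in the repo
    exclude=["**/SessionRecords/**"],   # assumed name of the recordings folder
)
print("\n".join(patterns))
```

A list like this could be fed to `git sparse-checkout set --no-cone --stdin`. The ordering matters because later patterns take precedence over earlier ones, so the exclusion has to follow the includes it is meant to carve out of.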

benbp avatar Sep 12 '22 18:09 benbp

Some of the sparse checkouts are still almost 2 mins, see https://dev.azure.com/azure-sdk/public/_build/results?buildId=1842107&view=logs&j=d3fdcdd1-7a8e-5668-5e2b-ac9753b27d6a&t=96ea04e2-37b6-5a66-a3dd-f25fc0d42970. Is that still expected?

@weshaggard The sparse checkout will still take a long time for core, since it downloads all code paths and just excludes the recordings. Testing locally, that's 1.1 GB.

benbp avatar Sep 15 '22 22:09 benbp

With the work that @sima-zhu recently did, we may want to consider splitting up the paths based on the chunking as well to help with some of this time. For now, though, I guess we will need to remain at 2-3 mins for the core clones.

weshaggard avatar Sep 19 '22 22:09 weshaggard

The build for the PR has been deleted. Is there any proof of the improvement?

We can find the sdk paths in the prop list file. We can achieve this by (1) adding another step in ci.tests.yml before sparse-checkout, or (2) having another property in platform-matrix.json for all the paths.

sima-zhu avatar Sep 22 '22 21:09 sima-zhu