
extremely slow Network/Disk IO on Windows agent compared to Ubuntu/Mac

Open jetersen opened this issue 3 years ago • 24 comments

Description:

https://github.com/actions/virtual-environments/issues/3577

Ubuntu agents have a slightly higher IOPS disk performance configuration. We use the install-dotnet.ps1 script provided by the DotNet team for installation. The DownloadFile and Extract-Dotnet-Package functions are slow. We will investigate whether we can improve the performance of those functions by replacing DownloadFile -> WebClient and Extract-Dotnet-Package -> 7zip

DownloadFile and Extract-Dotnet-Package are awfully slow. Like 3x slower!

SLOW

Task version: v1.9.0

Platform:

  • [ ] Ubuntu
  • [ ] macOS
  • [x] Windows

Runner type:

  • [x] Hosted
  • [ ] Self-hosted

Repro steps:
https://github.com/jetersen/dotnet.restore.slow.github.action

Expected behavior: Faster downloads

Actual behavior: SLOW downloads

jetersen avatar Jan 17 '22 07:01 jetersen

It can be fast:

FAST

jetersen avatar Jan 17 '22 08:01 jetersen

Perhaps consider not using the dotnet install script, or is there an option to contribute a fix to the dotnet install script?

jetersen avatar Jan 17 '22 08:01 jetersen

Hi @jetersen, we will try to resolve this problem.

vsafonkin avatar Jan 26 '22 10:01 vsafonkin

Hi Team - any news on this?

PureKrome avatar Aug 03 '22 11:08 PureKrome

Hello @PureKrome,

So far no updates

e-korolevskii avatar Aug 03 '22 12:08 e-korolevskii

The problem is not reproduced anymore.

Based on multiple runs, the action does not take more than 15 seconds. https://github.com/akv-demo/dotnet.restore.slow.github.action/actions/runs/6729109167

Most probably, the root cause of the problem was an infrastructure issue that has currently been resolved.

In case the problem reoccurs, the solution is to avoid bulk copying to the OS drive, similar to the workaround applied for the same problem in the actions/setup-go: https://github.com/actions/setup-go/pull/393

@jetersen did it help?

dsame avatar Nov 02 '23 06:11 dsame

@dsame I do not agree with the assessment that it is not reproducible 😓 Even with the cache available, Windows Server 2022 is still 20 seconds slower. Creating the cache on Windows Server 2022 still takes 1 minute longer than on Ubuntu.

So it is definitely an improvement, but I feel like Windows can perform better.


https://github.com/jetersen/dotnet.restore.slow.github.action/actions/runs/6736238225 https://github.com/jetersen/dotnet.restore.slow.github.action/actions/runs/6736262624

jetersen avatar Nov 02 '23 17:11 jetersen

I don't think it is fair to say look it is fixed for the actions/setup-dotnet when we are talking about a simple if check to see if .NET 6 is already available on the actions runner image 😓

Testing actions/setup-dotnet with the .NET 8 preview shows Ubuntu at 7 seconds vs. 30+ seconds on Windows (sometimes a little less).

https://github.com/jetersen/dotnet.restore.slow.github.action/actions/runs/6736383956/job/18311632176

While this issue remains open: https://github.com/actions/setup-dotnet/issues/141 this will definitely not improve 😢

jetersen avatar Nov 02 '23 17:11 jetersen

Thank you @jetersen for pointing out the .NET 8 issue; it's confirmed.

https://github.com/akv-demo/dotnet.restore.slow.github.action/actions/runs/6742358650/job/18328332807

dsame avatar Nov 03 '23 07:11 dsame

@dalyIsaac interesting approach, does that really save that much 🤔

jetersen avatar Nov 03 '23 12:11 jetersen

I'm fairly happy with the gains I've seen, but admittedly I didn't conduct a very rigorous study.

| sample | # jobs | mean | median | sample std dev |
| --- | --- | --- | --- | --- |
| Installing on C:\ | 16 | 02:16 | 02:27 | 00:30 |
| Caching[^1] on C:\ | 4 | 01:52 | 01:42 | 00:35 |
| Installing on D:\ | 12 | 01:37 | 01:34 | 00:24 |
| Caching on D:\ | 12 | 01:07 | 01:07 | 00:15 |

[^1]: Caching includes the actual caching and running dotnet restore. Cache sizes were about 700MB.

dalyIsaac avatar Nov 04 '23 04:11 dalyIsaac

Hello @jetersen

The quick fix is to set the environment variable DOTNET_INSTALL_DIR to a path on the "D:" drive.

https://github.com/akv-demo/dotnet.restore.slow.github.action/commit/45e801aefc79dfcec0bab6bb25d9f2972dda5770#diff-b803fcb7f17ed9235f1e5cb1fcd2f5d3b2838429d4368ae4c57ce4436577f03fR15

This workaround is proven to solve the problem (https://github.com/akv-demo/dotnet.restore.slow.github.action/actions/runs/6768243557/job/18392290993) and can be used until a fix in the action is available.
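
For readers who do not want to follow the commit link, the workaround might look roughly like the workflow below. This is only a sketch and not the contents of the linked commit: the job name, the `D:\dotnet` path, the action versions, and the `dotnet-version` value are all assumptions chosen for illustration.

```yaml
jobs:
  build:                      # hypothetical job name
    runs-on: windows-latest
    env:
      # Assumed path: any directory on the D: drive should serve the same purpose.
      DOTNET_INSTALL_DIR: 'D:\dotnet'
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-dotnet@v4
        with:
          dotnet-version: 8.0.x   # assumed version, pick the one you need
      - run: dotnet restore
```

Setting the variable at the job level rather than on a single step means later steps see the same install location, which is probably what you want if they invoke `dotnet` directly.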

dsame avatar Nov 06 '23 08:11 dsame

@dsame perhaps some of these fixes should be raised with @actions/runner-images? I assume we are hitting similar IO restrictions on the Windows images as this affects all windows based hosted runners 🫠

jetersen avatar Nov 06 '23 15:11 jetersen

Hello @jetersen, generally it is a good idea, but I doubt any of the actions team can solve the problem with the infrastructure, and most probably the infrastructure problem is not expected to be solved in an acceptable timeframe.

dsame avatar Nov 07 '23 00:11 dsame

> but i doubt any of actions team can solve the problem with the infrastructure

why is this? because these are 2 independent teams within GH and even though the actions team could make some changes based on this thread (which would benefit all users, by default) ... the intra-team still need to also do some changes but you're suggesting that this is a low priority so .. they go 'meh' ?

PureKrome avatar Nov 07 '23 01:11 PureKrome

Created https://github.com/actions/runner-images/issues/8755 in the hope that we can find a generic solution. I was hoping they could simply change the disk setup in the Windows packer scripts 🤔

jetersen avatar Nov 07 '23 06:11 jetersen

hi

blackstars701 avatar Dec 12 '23 14:12 blackstars701