runner icon indicating copy to clipboard operation
runner copied to clipboard

hashFiles() couldn't finish within 120 seconds.

Open Carsten-MaD opened this issue 2 years ago • 20 comments

(filing here since I understand that hashFile() is owned by https://github.com/actions/runner)

Hey!

We have migrated a semi-large Unity project from SVN to Git / Github and I am working on setting up Actions for a build pipeline. Unfortunately, a step to create a cache for Unity-specific files is giving me a headache. If I understand the error message correctly, I am running into a 120 second timeout for that step.

As I see it, the timeout for hasFiles() is set here and there is no way to increase it, is that correct? https://github.com/actions/runner/blob/100c99f2635be35c7ea690c58fa7d203725cac9e/src/Runner.Worker/Expressions/HashFilesFunction.cs#L16

Expected behavior I imagine it makes a lot of sense that the job has a timeout. It would be helpful if it could be set from the step configuration for scenarios where a lot of files have to be hashed.

Runner Version and Platform

Runner version: 2.290.1 OS: Windows 10 64bit self hosted runner

Action configuration:

- uses: actions/cache@v3
  with:
    path: Library
    key: Library-${{ hashFiles('Assets\**', 'Packages\**', 'ProjectSettings\**') }}
    restore-keys: |
      Library-

Job Log Output

##[debug]Evaluating condition for step: 'Run actions/cache@v3'
##[debug]Evaluating: success()
##[debug]Evaluating success:
##[debug]=> true
##[debug]Result: true
##[debug]Starting: Run actions/cache@v3
##[debug]Register post job cleanup for action: actions/cache@v3
##[debug]Loading inputs
##[debug]Evaluating: format('Library-{0}', hashFiles('Assets\**', 'Packages\**', 'ProjectSettings\**'))
##[debug]Evaluating format:
##[debug]..Evaluating String:
##[debug]..=> 'Library-{0}'
##[debug]..Evaluating hashFiles:
##[debug]....Evaluating String:
##[debug]....=> 'Assets\**'
##[debug]....Evaluating String:
##[debug]....=> 'Packages\**'
##[debug]....Evaluating String:
##[debug]....=> 'ProjectSettings\**'
##[debug]Search root directory: 'D:\actions-runner\_work\[omitted]\[omitted]'
##[debug]Search pattern: 'Assets\**, Packages\**, ProjectSettings\**'
##[debug]Starting process:
##[debug]  File name: 'D:\actions-runner\externals\node16\bin\node.exe'
##[debug]  Arguments: '"D:\actions-runner\bin\hashFiles"'
##[debug]  Working directory: 'D:\actions-runner\_work\[omitted]\[omitted]'
##[debug]  Require exit code zero: 'False'
##[debug]  Encoding web name:  ; code page: ''
##[debug]  Force kill process on cancellation: 'False'
##[debug]  Redirected STDIN: 'False'
##[debug]  Persist current code page: 'False'
##[debug]  Keep redirected STDIN open: 'False'
##[debug]  High priority process: 'False'
##[debug]Process started with process id 10936, waiting for process exit.
##[debug]Match Pattern: Assets\**
##[debug]Packages\**
##[debug]ProjectSettings\**
##[debug]::debug::followSymbolicLinks 'false'
##[debug]::debug::followSymbolicLinks 'false'
##[debug]::debug::implicitDescendants 'true'
##[debug]::debug::omitBrokenSymbolicLinks 'true'
##[debug]::debug::Search path 'D:\actions-runner\_work\[omitted]\[omitted]\Assets'
##[debug]::debug::Search path 'D:\actions-runner\_work\[omitted]\[omitted]\Packages'
##[debug]::debug::Search path 'D:\actions-runner\_work\[omitted]\[omitted]\ProjectSettings'

[looong list of files that are processed, omitted]

##[debug]Sending CTRL_C to process 10936.
##[debug]Successfully send CTRL_C to process 10936.
##[debug]Waiting for process exit or 7.5 seconds after CTRL_C signal fired.
##[debug]Ignore Ctrl+C to current process.
##[debug]STDOUT/STDERR stream read finished.
##[debug]STDOUT/STDERR stream read finished.
##[debug]Process exit successfully.
##[debug]Process cancelled successfully through Ctrl+C/SIGINT.
##[debug]Process Cancellation finished.
##[debug]Finished process 10936 with exit code -1073741510, and elapsed time 00:02:00.0198012.
Error: .github/workflows/main.yml (Line: 43, Col: 16):
Error: The template is not valid. .github/workflows/main.yml (Line: 43, Col: 16): hashFiles('Assets\**, Packages\**, ProjectSettings\**') couldn't finish within 120 seconds.
##[debug]GitHub.DistributedTask.ObjectTemplating.TemplateValidationException: The template is not valid. .github/workflows/main.yml (Line: 43, Col: 16): hashFiles('Assets\**, Packages\**, ProjectSettings\**') couldn't finish within 120 seconds.
##[debug]   at GitHub.DistributedTask.ObjectTemplating.TemplateValidationErrors.Check()
##[debug]   at GitHub.DistributedTask.Pipelines.ObjectTemplating.PipelineTemplateEvaluator.EvaluateStepInputs(TemplateToken token, DictionaryContextData contextData, IList`1 expressionFunctions)
##[debug]   at GitHub.Runner.Worker.ActionRunner.RunAsync()
##[debug]   at GitHub.Runner.Worker.StepsRunner.RunStepAsync(IStep step, CancellationToken jobCancellationToken)
##[debug]Finishing: Run actions/cache@v3

Carsten-MaD avatar Apr 21 '22 09:04 Carsten-MaD

I have the same problem when using the cache action to cache Maven artifacts. I have 320 pom.xml files "only", in my repository. I wonder why it take more than 2 minutes to calculate the hash of these small files.

I call the action that way:

      - name: Cache Maven dependencies
        uses: actions/cache@v2
        with:
          path: ~/.m2/repository
          key: ${{ runner.os }}-maven-${{ hashFiles('**/pom.xml') }}
          restore-keys: |
            ${{ runner.os }}-maven-

And I'm running on a self-hosted Windows Server 2016.

uportalis avatar Apr 21 '22 11:04 uportalis

https://github.com/actions/runner/pull/1844

Carsten-MaD avatar Apr 22 '22 18:04 Carsten-MaD

We're also getting this error in our pipelines when using actions/cache. Did not use to happen in the past, started happening only last week. Tried using cache@v2 as well as cache@v3. We for example have a workflow with 5 jobs running at the same time, ran by a reusable workflow file. If we encounter this error, it always happens for the last job to finish out of these 5. Re-running the failed job sometimes helps on the first try, sometimes on the second or third retry.

MikulasMascautanu avatar May 31 '22 09:05 MikulasMascautanu

@MikulasMascautanu I want to say that my issue and what you describe are not necessarily of the same nature. In my case, it makes a lot of sense that hashing all the files would indeed take longer than two minutes: It is a rather large Unity game project. My issue is that I would like it to enable to take longer, also see the PR mentioned above.

Your case seems to be a bit different: Often it finishes in under two minutes, sometimes it does not. In such a case, there might be a different underlying reason, potentially with the mechanic that hashes the files itself.

Carsten-MaD avatar May 31 '22 11:05 Carsten-MaD

@Carsten-MaD you're right. Though I still think this optional timeout setting could be benefitial to actions/cache. I will post my original comment into the actual actions/cache ticket. Thanks!

MikulasMascautanu avatar May 31 '22 14:05 MikulasMascautanu

@Carsten-MaD did you manage to sort this? I'm seeing a similar issue on my Podfile.lock file.

shtefanilie avatar Jul 14 '22 16:07 shtefanilie

Besides waiting for the PR to be merged (or even checked out), can anyone think of a workaround for this?

JJ avatar Jul 20 '22 07:07 JJ

@JJ What I personally did, was to remove entirely the actions/cache steps from my flow. Otherwise, my release would've still been blocked. What was very weird was that it started with Podfile.lock issues, and after removing that one, the yarn.lock caching failed soon after. So in the end I had to remove all of them. Sadly it seems nobody is interested in that PR @Carsten-MaD opened...

shtefanilie avatar Jul 20 '22 08:07 shtefanilie

@shtefanilie my issue with this also started with Podfile.lock issues then I realized that GHA was using the cached version of my Podfile. I had to force the cache to get busted by changing the hash name like "-pods-v2" so GHA would see the cache doesn't exist and then recreate the Podfile.lock file. Just leaving it here as it could help others too 😊

pedpess avatar Jul 20 '22 08:07 pedpess

@JJ What I personally did, was to remove entirely the actions/cache steps from my flow. Otherwise, my release would've still been blocked. What was very weird was that it started with Podfile.lock issues, and after removing that one, the yarn.lock caching failed soon after. So in the end I had to remove all of them. Sadly it seems nobody is interested in that PR @Carsten-MaD opened...

Yep, makes sense. That's what I have done too. And yes, that's sad.

JJ avatar Jul 20 '22 08:07 JJ

@pedpess yeah, that might be a way to do it. Might be worth it see if I add back the caching, wether if that fixes the issue

shtefanilie avatar Jul 20 '22 10:07 shtefanilie

also experiencing this issue as of 7/29.

key: ${{ runner.os }}-pods-${{ inputs.cache_version }}-${{ hashFiles('**/Podfile.lock') }}

is timing out. current work around is to remove cache which is not ideal.

kroakdammit avatar Jul 29 '22 23:07 kroakdammit

Also seeing this while hashing a small conda environment file

wpbonelli avatar Aug 11 '22 11:08 wpbonelli

I see this almost everyday on my environment, and i have (more or less) 3 pom.xml files on the repo...

fl250144 avatar Aug 19 '22 17:08 fl250144

I had that exact error message, too, only on Windows. But in my case it was happening not in the Run actions/cache@v2 step, but at the end of the job in the Post actions/cache@v2 step.

Which was counter-intuitive to me: given that it computed the hashFiles in the beginning, I was not expecting it to compute it a second time. And what happens is that I hashFiles('./some_folder'), but then I build stuff there, so many files get added.

My solution was therefore just to make sure that my job does not write anything into <some_folder>, so that the hash stays easy to compute (and the same between the beginning and the end of the job).

I'm fairly sure that other platforms don't recompute the hashFiles for the Post actions/cache@v2 step, otherwise I would have the same issue (and the key would always change). It only ever happened on Windows for me.

JonasVautherin avatar Sep 07 '22 21:09 JonasVautherin

@JJ: Assuming that the runtime is performing hashFiles() at step evaluation, you should be able to perform your own hashing on disk in a step...

Old:

      - name: Cache Maven dependencies
        uses: actions/cache@v2
        with:
          path: ~/.m2/repository
          key: ${{ runner.os }}-maven-${{ hashFiles('**/pom.xml') }}
          restore-keys: |
            ${{ runner.os }}-maven-

Replacement:

      - name: Workaround hashFiles timeout
        run: |
          find ~/.m2/repository -name pom.xml -print0 | xargs -0 shasum > ~/.m2/repository/pom.xml.shasum
      - name: Cache Maven dependencies
        uses: actions/cache@v2
        with:
          path: ~/.m2/repository
          key: ${{ runner.os }}-maven-${{ hashFiles('pom.xml.shasum') }}
          restore-keys: |
            ${{ runner.os }}-maven-
    ...
      - name: Workaround hashFiles timeout update
        run: |
          find ~/.m2/repository -name pom.xml -print0 | xargs -0 shasum > ~/.m2/repository/pom.xml.shasum

Disclaimers:

  1. This isn't tested
  2. find, xargs, shasum might not be available for your OS/Shell, as applicable you may need to replace the commands...
  3. Please don't randomly trust random people's code w/o inspecting / considering it 😄

jsoref avatar Jan 29 '23 08:01 jsoref

any update?

himself65 avatar Jun 28 '23 11:06 himself65

Still waiting for an update on this, this more often than not provides a false positive in our runs, would love to have an option to simply increase or remove the 120 second limit.

IgnacioBecerra avatar Jul 10 '23 22:07 IgnacioBecerra

Bump. Came across this issue repeatedly. Please merge and release the fix soon!

udayraj-rzp avatar Sep 14 '23 10:09 udayraj-rzp

Alternative (key based on Git commit hash)

If you're ok with using the last Git commit hash modifying a certain directory, you can avoid the expensive operation of hashing all files - the git log command is pretty much instant, and fetching the full history of your repository may be faster than the hashing operation.

Configure your actions/checkout to fetch the full history (or a limited number of commits) and add an extra step to set a cache key based on the last commit to affect a certain directory:

name: Skip cache restoration on changes in directory
on: [push]
jobs:
  build:
    runs-on: ubuntu-latest
    timeout-minutes: 10
    steps:
    - name: Checkout repository
      uses: actions/checkout@v4
      with:
        # Fetch all commits in all branches and tags
        fetch-depth: 0

    - name: Get last Git commit hash modifying packages/abc
      run: |
        echo "ABC_HASH=$(git log -1 --pretty=format:%H -- packages/abc)" >> $GITHUB_ENV

    - name: Cache packages/abc
      uses: actions/cache@v4
      with:
        path: packages/abc
        key: abc-build-cache-${{ env.ABC_HASH }}

    - name: Build packages/abc
      run: |
        pnpm --filter=abc build

Also added this to github-tricks just now:

  • https://github.com/karlhorky/github-tricks/blob/main/readme.md#github-actions-configure-actionscache-to-skip-cache-restoration-on-changes-in-directory

karlhorky avatar May 05 '24 15:05 karlhorky