mxnet

Flaky CI step 1: cannot clear workspace directory

Open DickJC123 opened this issue 2 years ago • 1 comment

Description

I'm seeing CI jobs fail at the first step, "Recursively delete the current directory from the workspace". The log output is:

 java.nio.channels.ClosedChannelException

Seems unrelated to the specific PR.

Occurrences

https://jenkins.mxnet-ci.com/blue/organizations/jenkins/mxnet-validation%2Fcentos-gpu/detail/PR-21104/5/pipeline
https://jenkins.mxnet-ci.com/blue/organizations/jenkins/mxnet-validation%2Fcentos-gpu/detail/PR-21104/7/pipeline/42

What have you tried to solve it?

  1. Retry job to bypass issue.
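The retry above is done by hand. Here is a minimal sketch of automating it, assuming the failure is transient and that simply repeating the cleanup after a short wait is acceptable. This is generic Python, not the Jenkins step or the MXNet CI configuration, and `retry` and `clear_workspace` are hypothetical names used only for illustration:

```python
import os
import shutil
import time


def retry(step, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call step(); on an exception, wait and call it again.

    Delays grow exponentially (base_delay, 2*base_delay, ...).
    The last exception is re-raised once `attempts` calls have failed.
    """
    for attempt in range(1, attempts + 1):
        try:
            return step()
        except Exception:
            if attempt == attempts:
                raise  # out of retries: surface the final error
            sleep(base_delay * 2 ** (attempt - 1))


def clear_workspace(path):
    """Hypothetical cleanup step: delete `path` recursively and recreate it empty."""
    if os.path.isdir(path):
        shutil.rmtree(path)
    os.makedirs(path)
```

Something like `retry(lambda: clear_workspace("/tmp/ws"), attempts=3)` would then wrap the cleanup. Note that this only hides the symptom; it does not explain why the channel is being closed.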

DickJC123, Aug 02 '22 23:08

I've noticed this happen at random as well. It seems it only happens when Jenkins has a large number of worker nodes (and thus jobs) running at the same time. Unfortunately, I haven't been able to root-cause the issue yet.

josephevans, Aug 04 '22 03:08