
WIP: [ci] fix R 3.6 Windows build (fixes #5036)

Open jameslamb opened this issue 2 years ago • 7 comments

Attempting to investigate #5036.

First, I'm pushing a commit just re-running all the R CI to see if the same error shows up.

After that, I'll start investigating the root cause of the new failures, beginning with things like making the logs more verbose. This might be a slow process involving a lot of just printing things to logs, since https://github.com/nelsonjchen/reverse-rdp-windows-github-actions is no longer available.

jameslamb avatar Feb 28 '22 00:02 jameslamb

I see the same error on https://github.com/microsoft/LightGBM/runs/5353387904?check_suite_focus=true as reported in #5036.

Pushed https://github.com/microsoft/LightGBM/pull/5037/commits/e93cb856cc6e232e8f20176ade8bbd1b3c5a9db0, adding more logs to try to understand what happened.

jameslamb avatar Feb 28 '22 01:02 jameslamb

I think that we have tar issues again with Windows builds. See the third item in https://github.com/microsoft/LightGBM/pull/3946#pullrequestreview-799415812 for context.

In the logs for the most recent build (link), I see the following:

removing object files created by vignettes
untarring lightgbm_3.3.2.99.tar.gz
done untarring lightgbm_3.3.2.99.tar.gz
re-tarring lightgbm_3.3.2.99.tar.gz
Running R CMD check
* using log directory 'C:/tmp-r-cmd-check/lightgbm.Rcheck'

Note that the line "Done creating lightgbm_3.3.2.99.tar.gz" from that section of build-cran-package.sh is not shown.

https://github.com/microsoft/LightGBM/blob/01568cf59a412de7823b373fe13eb36db9cffe53/build-cran-package.sh#L207-L221

I just pushed https://github.com/microsoft/LightGBM/pull/5037/commits/2c0e362cb4425a095c58bd5c4c57bc20604d5c30 to try printing the locations of tar, gzip, and gunzip.

That commit also reverts https://github.com/microsoft/LightGBM/pull/5037/commits/e93cb856cc6e232e8f20176ade8bbd1b3c5a9db0... I realize now that https://github.com/microsoft/LightGBM/pull/5037/commits/e93cb856cc6e232e8f20176ade8bbd1b3c5a9db0 wouldn't have helped anyway, since it only affects CMake-based builds and the problem is related to CRAN builds.
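For anyone who wants to reproduce the "print the locations" check outside of CI, here is a minimal sketch in Python. The CI script itself uses PowerShell's `Get-Command`, so this is only an illustration, and the function name `locate_tools` is hypothetical. It relies on the standard library's `shutil.which`, which returns the first match on `PATH` or `None`.

```python
import shutil


def locate_tools(names=("tar", "gzip", "gunzip")):
    """Map each tool name to the first match on PATH, or None if absent."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    # Mirror the "--- location of <tool> ---" headers used in the CI logs.
    for name, path in locate_tools().items():
        print(f"--- location of {name} ---")
        print(path if path else "not found on PATH")
```

A `None` result here corresponds to the case where `Get-Command` reports that the term is not recognized.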

jameslamb avatar Mar 01 '22 18:03 jameslamb

> try printing the locations of tar, gzip, and gunzip.

For the failing R 3.6 CRAN build (link), I see

--- location of tar ---
CommandType     Name                                               Version    Source
-----------     ----                                               -------    ------
Application     tar.exe                                            0.0.0.0    C:\Rtools\bin\tar.exe
--- location of gzip ---
Application     gzip.exe                                           0.0.0.0    C:\msys64\usr\bin\gzip.exe
--- location of gunzip ---
Application     gunzip                                             0.0.0.0    C:\msys64\usr\bin\gunzip

For the succeeding R 4.x CRAN build (link), I see that all three utilities are coming from Rtools

CommandType     Name                                               Version    Source
-----------     ----                                               -------    ------
Application     tar.exe                                            0.0.0.0    C:\rtools40\usr\bin\tar.exe
--- location of gzip ---
Application     gzip.exe                                           0.0.0.0    C:\rtools40\usr\bin\gzip.exe
--- location of gunzip ---
Application     gunzip                                             0.0.0.0    C:\rtools40\usr\bin\gunzip

I wonder if the root issue is some combination of:

  • mixing different builds of those tools can cause problems
  • some recent change to the windows-latest environment in GitHub Actions removed C:\msys64\usr\bin\tar.exe
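The first hypothesis above (mixed toolchains) can be checked mechanically. The sketch below groups the paths copied from the two logs by their parent directory. The helper names `group_by_directory` and `is_mixed` are hypothetical, and `ntpath` is used so that the Windows paths parse the same way on any operating system.

```python
import ntpath
from collections import defaultdict


def group_by_directory(tool_paths):
    """Group tool names by the (case-insensitive) directory that provides them."""
    groups = defaultdict(list)
    for name, path in tool_paths.items():
        groups[ntpath.dirname(path).lower()].append(name)
    return dict(groups)


def is_mixed(tool_paths):
    """True if the tools are provided by more than one directory."""
    return len(group_by_directory(tool_paths)) > 1


# Paths copied from the failing R 3.6 log.
failing = {
    "tar": r"C:\Rtools\bin\tar.exe",
    "gzip": r"C:\msys64\usr\bin\gzip.exe",
    "gunzip": r"C:\msys64\usr\bin\gunzip",
}

# Paths copied from the succeeding R 4.x log.
succeeding = {
    "tar": r"C:\rtools40\usr\bin\tar.exe",
    "gzip": r"C:\rtools40\usr\bin\gzip.exe",
    "gunzip": r"C:\rtools40\usr\bin\gunzip",
}

if __name__ == "__main__":
    print("failing build mixes toolchains:", is_mixed(failing))
    print("succeeding build mixes toolchains:", is_mixed(succeeding))
```

This only confirms that the failing job mixes Rtools and MSYS2 binaries. It does not by itself show that the mix is the cause of the failure.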

I found that some other utilities provided by the MSYS2 distribution used in these images are available at C:\msys64\mingw64\bin.

https://github.com/actions/virtual-environments/blob/3b5c4ebd39ce3d5812e130938bb066f67d90b54e/images/win/scripts/Installers/Install-Msys2.ps1#L75-L76

I just pushed https://github.com/microsoft/LightGBM/pull/5037/commits/3e1d5f8b37a667672ee9c110ba0ccc8e9c4cb455, to see if adding that to PATH helps.

jameslamb avatar Mar 01 '22 18:03 jameslamb

I tried just using the tools bundled with Rtools35 (by ensuring that Rtools paths are at the beginning of PATH) and unfortunately there is not a gunzip there.

(failing R 3.6 CRAN build link)

CommandType     Name                                               Version    Source
-----------     ----                                               -------    ------
Application     tar.exe                                            0.0.0.0    C:\Rtools\bin\tar.exe
--- location of gzip ---
Application     gzip.exe                                           0.0.0.0    C:\Rtools\bin\gzip.exe
--- location of gunzip ---
Get-Command: D:\a\LightGBM\LightGBM\.ci\test_r_package_windows.ps1:180
Line |
 180 |      Get-Command gunzip
     |      ~~~~~~~~~~~~~~~~~~
     | The term 'gunzip' is not recognized as a name of a cmdlet, function, script file, or executable
     | program. Check the spelling of the name, or if a path was included, verify that the path is correct
     | and try again.

There is one in newer versions of Rtools.

(successful R 4.x CRAN build link)

CommandType     Name                                               Version    Source
-----------     ----                                               -------    ------
Application     tar.exe                                            0.0.0.0    C:\rtools40\usr\bin\tar.exe
--- location of gzip ---
Application     gzip.exe                                           0.0.0.0    C:\rtools40\usr\bin\gzip.exe
--- location of gunzip ---
Application     gunzip                                             0.0.0.0    C:\rtools40\usr\bin\gunzip
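The PATH reordering described at the start of this comment can be sketched as a small pure function. This is an illustration only, not the code in `.ci/test_r_package_windows.ps1`, and `prepend_to_path` is a hypothetical name. It moves the given directories to the front and drops later duplicates, comparing case-insensitively because Windows paths are case-insensitive.

```python
def prepend_to_path(path_value, new_dirs, sep=";"):
    """Return path_value with new_dirs at the front and duplicates removed."""
    existing = [entry for entry in path_value.split(sep) if entry]
    front = list(new_dirs)
    front_keys = {entry.lower() for entry in front}
    # Keep the remaining entries in their original order.
    rest = [entry for entry in existing if entry.lower() not in front_keys]
    return sep.join(front + rest)


if __name__ == "__main__":
    current = r"C:\msys64\usr\bin;C:\Rtools\bin;C:\Windows"
    print(prepend_to_path(current, [r"C:\Rtools\bin"]))
```

As the logs above show, reordering alone is not enough for Rtools35, because that directory has no gunzip to find.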

jameslamb avatar Mar 03 '22 15:03 jameslamb

OK, I was able to get a few more logs from the failing line. (build link)

re-tarring lightgbm_3.3.2.99.tar.gz
lightgbm/
tar (child)lightgbm/build/
lightgbm/build/vignette.rds
lightgbm/cleanup
lightgbm/configure
: gzip: Cannot exec: No such file or directory
tar (child): Error is not recoverable: exiting now
tar: lightgbm_3.3.2.99.tar.gz: Cannot write: Broken pipe
tar: Child returned status 2
tar: Error is not recoverable: exiting now
Running R CMD check

That error message is what I'd expect to see if gzip wasn't installed at all, but I can tell from the logs of Get-Command that it is!

I just pushed https://github.com/microsoft/LightGBM/pull/5037/commits/ab59cfb71534d65562af05c7b3656a08382630ee adding -z to the original tar command (based on this very old Stack Overflow answer, https://stackoverflow.com/questions/9749466/cant-untar-a-complete-directory-using-tar-cvpzf#comment12404275_9749491), but that did not work.
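Since the failure happens when tar tries to spawn gzip as a child process, one way to sidestep it would be to do the compression in-process. The sketch below uses Python's standard `tarfile` module for that. It is an assumption-laden illustration: it is not what build-cran-package.sh does, it is not necessarily the fix adopted later, and `make_tarball` is a hypothetical name.

```python
import os
import tarfile
import tempfile


def make_tarball(source_dir, output_path):
    """Write source_dir into a gzip-compressed tarball without calling gzip."""
    source_dir = os.path.abspath(source_dir)
    with tarfile.open(output_path, "w:gz") as archive:
        # arcname keeps only the top-level directory name inside the archive.
        archive.add(source_dir, arcname=os.path.basename(source_dir))
    return output_path


if __name__ == "__main__":
    # Build a tiny stand-in package directory and archive it.
    with tempfile.TemporaryDirectory() as tmp:
        pkg = os.path.join(tmp, "lightgbm")
        os.makedirs(os.path.join(pkg, "build"))
        with open(os.path.join(pkg, "configure"), "w") as handle:
            handle.write("# placeholder\n")
        out = make_tarball(pkg, os.path.join(tmp, "lightgbm_3.3.2.99.tar.gz"))
        with tarfile.open(out, "r:gz") as archive:
            print(archive.getnames())
```

Whether a tarball produced this way is acceptable to `R CMD check` would still need to be verified in CI.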

I'm traveling right now and will be away for the next 8 hours or so, sorry I'm not able to devote more time to this right now.

jameslamb avatar Mar 03 '22 16:03 jameslamb

Hi, @jameslamb is the CI job normal now?

guolinke avatar Jul 23 '22 14:07 guolinke

> is the CI job normal now?

@guolinke I will come back to this draft PR once LightGBM's CI is fixed (https://github.com/microsoft/LightGBM/pull/5362#issuecomment-1192780327). I had been prioritizing other things, like getting R 4.2 support added (#5274).

jameslamb avatar Jul 25 '22 00:07 jameslamb

To make things easier for reviewers, I'm closing this PR with lots of debugging commits and comments. It is replaced by #5479.

jameslamb avatar Sep 10 '22 16:09 jameslamb

This pull request has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this.

github-actions[bot] avatar Aug 19 '23 03:08 github-actions[bot]