pkgr must more aggressively error on failed download
When downloading packages, a 404 or any other failure to download should not just print a warning and proceed; the installation will obviously fail, since that package does not exist.
1388:{"level":"warning","msg":"bad server response","package":"cmprsk","status":"404 Not Found","status_code":404,"time":"2019-09-30T08:42:15-04:00","url":"https://metrumresearchgroup.github.io/cran/2019-09-22/src/contrib/cmprsk_2.2-7.1.tar.gz"}
ERRO[0509] installation failed for packages: cmprsk
TRAC[0509] Resetting package environment
INFO[0510] duration:8m30.328218453s
FATA[0510] failed package install with err, failed installation for packages: cmprsk
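To illustrate the behavior being asked for, here is a minimal sketch in which any non-2xx response becomes an error instead of a warning. `checkStatus` is a hypothetical helper written for this example, not pkgr's actual code.

```go
package main

import (
	"fmt"
	"net/http"
)

// checkStatus is a hypothetical helper (not part of pkgr): it turns any
// non-2xx HTTP status into an error so the caller can stop before the
// install step instead of logging a warning and carrying on.
func checkStatus(pkg string, statusCode int) error {
	if statusCode < 200 || statusCode > 299 {
		return fmt.Errorf("download failed for %s: %d %s",
			pkg, statusCode, http.StatusText(statusCode))
	}
	return nil
}

func main() {
	// Simulate the 404 from the log above.
	if err := checkStatus("cmprsk", http.StatusNotFound); err != nil {
		fmt.Println("error:", err)
	}
}
```

With a check like this, the caller could abort as soon as the download fails rather than eight minutes later at install time.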
Can you possibly add a Retry here:
https://github.com/metrumresearchgroup/pkgr/blob/develop/cran/download-package.go#L128
I added a log to record the reason for download fail
log.WithField("package", d.Package.Package).Warn(err)
And the failure I am seeing is:
INFO[0087] downloading package package=nycflights13
WARN[0087] http2: server sent GOAWAY and closed the connection; LastStreamID=1999, ErrCode=NO_ERROR, debug="" package=reticulate
WARN[0087] http2: server sent GOAWAY and closed the connection; LastStreamID=1999, ErrCode=NO_ERROR, debug="" package=caret
WARN[0087] downloading failed package=reticulate
WARN[0087] downloading failed package=caret
WARN[0087] http2: server sent GOAWAY and closed the connection; LastStreamID=1999, ErrCode=NO_ERROR, debug="" package=gt
WARN[0087] downloading failed package=gt
WARN[0087] http2: server sent GOAWAY and closed the connection; LastStreamID=1999, ErrCode=NO_ERROR, debug="" package=tsibble
WARN[0087] downloading failed package=tsibble
Wondering if perhaps the server thinks it's a DoS attack?
Thanks for the report Brian - what type of repo are you pulling from - GitLab Pages? RStudio Package Manager? MRAN? Other? There is pretty high concurrency on these requests.
This is on the queue to just redo anyway - pkgr shouldn't just warn, it should error outright, since if a download fails the install is obviously going to fail too.
What would be helpful to know is what we could add to make this more robust - some retries on the HTTP end, plus potentially some concurrent_dl setting.
INFO[0002] package installation sources
AmgenInternal=25
BioCann=1
BioCsoft=17
CRAN=994 ========> (https://cran.microsoft.com/snapshot/2021-10-08)
CRAN_20190901=1
CRAN_20200118=4
CRAN_20200510=2
H2O=1
INLA=1
glmmADMB_repo=1
tarballs=0
ok thanks - that CRAN server is... not great... it has had a number of outages. pkgr needs improvement here, but in the meantime, some knowledge to drop: switch away from MRAN, since RStudio now snapshots CRAN every night:

I've switched over and it's been much better
url: https://packagemanager.rstudio.com/client/#/repos/1/overview
Ah, cool, let me give that a try. Yeah, my builds have been failing for the past few days. I was about to go with a hack in your code here and put a 1-second sleep in place so that I don't spam that server. https://github.com/metrumresearchgroup/pkgr/blob/develop/cran/download-package.go#L124 It definitely seems like it thinks I am DoS'ing it. The one-second delay on each round let it succeed.
I'll let you know how the RStudio snapshot works....
package manager is much slower (like it has 1-2 second built-in throttle), but my first attempt passed, so seems like a good work-around. Thanks!
I think MRAN would be fine, but it needs a download throttle... so maybe a workaround is to introduce a new property in the yml file for a per-repo download throttle in seconds. I think this is an issue only because I am up to 999 packages from there.
kk give #389 a try - that has an exponential backoff built in (thanks hashicorp) and the concurrency control knob - perhaps
PKGR_DL_CONCURRENCY=3 pkgr install ...
combined with the retries just in case will do the trick. Interesting that package manager was slower for you, we've seen that speed things up for us compared to MRAN - though to be fair, 99.99% of the time we point to mpn.metworx.com these days :-) (though i'm not sure if your entire snapshot is present in MPN)
No errors during the download phase using #389 on the MRAN link, so that seems like a positive!
Regarding timing: RStudio Package Manager History Repo took 6m47s to download everything; MRAN took 3m29s.
PKGR_DL_CONCURRENCY=2 works as well, sufficiently throttling the concurrent downloads so that the last few packages don't start failing and hit the download retry. (Though at 2 threads, it's as slow as RS Package Manager.....6m27).
So 2 positive changes.
Poking this topic.
- Just an additional note: when packages passively fail to download upstream, installs seem to passively fail downstream (e.g. the process does not exit with a non-zero exit code, which is what docker builds need in order to fail)
time="2022-02-03T04:35:15-08:00" level=info msg="Successfully Installed." package=libcoin remaining=660 repo=CRAN version=1.0-9
time="2022-02-03T04:35:16-08:00" level=error msg="cmd output" exit_code=1 output="Warning: invalid package ‘/opt/local/docker/installers/runtime/pkgr’\nError: ERROR: no packages specified\n" package=coda stderr="Warning: invalid package ‘/opt/local/docker/installers/runtime/pkgr’\nError: ERROR: no packages specified\n" stdout=
time="2022-02-03T04:35:16-08:00" level=warning msg="error installing" err="exit status 1"
time="2022-02-03T04:35:19-08:00" level=info msg="Successfully Installed." package=mc2d remaining=659 repo=CRAN version=0.1-21
Also, it seems that the final failure does not exit with a non-zero exit code:
time="2022-02-03T04:38:25-08:00" level=error msg="did not install IRdisplay"
time="2022-02-03T04:38:25-08:00" level=error msg="did not install distributional"
time="2022-02-03T04:38:25-08:00" level=error msg="did not install refund"
time="2022-02-03T04:38:25-08:00" level=error msg="installation failed for packages: ucminf, proto, svUnit, wavelets, entropy, coda, clue"
time="2022-02-03T04:38:25-08:00" level=info msg="starting individual tarball install"
time="2022-02-03T04:38:25-08:00" level=info msg="total package install time" duration=37m12.389199139s
time="2022-02-03T04:38:26-08:00" level=info msg="duration:37m13.64722306s"
time="2022-02-03T04:38:26-08:00" level=error msg="failed package install with err, failed installation for packages: ucminf, proto, svUnit, wavelets, entropy, coda, clue"
So it would mainly be nice to get a non-zero exit code so that automated builds fail properly.
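As a sketch of the requested behavior, the program below exits with status 1 whenever the list of failed packages is non-empty. `exitCode` is a hypothetical helper, and taking the failed package names from the command line is purely for demonstration; it is not pkgr's interface.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// exitCode (hypothetical) maps the list of failed packages to a process
// exit status: 0 when everything installed, 1 otherwise.
func exitCode(failed []string) int {
	if len(failed) > 0 {
		return 1
	}
	return 0
}

func main() {
	// For demonstration only: treat command-line arguments as the names of
	// packages that failed to install. With no arguments the run succeeds.
	failed := os.Args[1:]
	if code := exitCode(failed); code != 0 {
		fmt.Fprintf(os.Stderr, "installation failed for packages: %s\n",
			strings.Join(failed, ", "))
		os.Exit(code)
	}
	fmt.Println("all packages installed")
}
```

Running the built binary with arguments such as `coda clue` exits with status 1, which is what a `RUN` step in a docker build needs in order to stop.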