
[R] Treat builds on R-universe like `NOT_CRAN`

Open eitsupi opened this issue 1 year ago • 5 comments

Describe the enhancement requested

Related to #43030

Binary builds on CRAN are unreliable, and builds on R-universe would be faster and more versatile if we could use libarrow with feature flags enabled. Since R-universe sets an environment variable called `MY_UNIVERSE`, I wonder whether we could check this variable to enable binary downloads. Normally, downloading binaries of unknown origin would not be very welcome, but I think the advantages outweigh the disadvantages here.

Here is an example from the polars package, which has such a mechanism: https://github.com/pola-rs/r-polars/blob/77bba3cf53a0fbaee139514b21b8d8ed0a087f6d/configure#L52-L61
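As a rough illustration of the kind of check proposed here, the following shell sketch shows how a `configure`-style script might decide whether a prebuilt binary may be downloaded. It is not taken from the polars or arrow sources: the function name `allow_binary_download` and the accepted spellings of `NOT_CRAN` are assumptions made for this example. Only the variable names `NOT_CRAN` and `MY_UNIVERSE` come from the discussion.

```shell
#!/bin/sh
# Hypothetical helper (name is an assumption for this sketch):
# succeed when downloading a prebuilt binary is acceptable.
allow_binary_download() {
  # Explicit opt-in via NOT_CRAN; the accepted values are an assumption.
  case "${NOT_CRAN:-}" in
    true|TRUE|1) return 0 ;;
  esac
  # R-universe sets MY_UNIVERSE, so any non-empty value is treated
  # the same way as an opt-in.
  if [ -n "${MY_UNIVERSE:-}" ]; then
    return 0
  fi
  return 1
}

if allow_binary_download; then
  echo "binary download allowed"
else
  echo "building from source"
fi
```

With neither variable set, the sketch falls through to the source build, which matches the default behaviour expected on CRAN.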

Example of CRAN binary problem: the latest arrow package is not built for macOS x86_64 on CRAN, so CI takes almost 30 minutes if we have a dependency on the arrow package on R-universe. nanoarrow passed CI in 2 minutes on other platforms, but on macOS x86_64 it took 30 minutes due to libarrow source build. https://github.com/r-universe/r-multiverse/actions/runs/10774098598/job/29875540102

Component(s)

R

eitsupi avatar Sep 09 '24 13:09 eitsupi

I think that's reasonable but arrow is also on r-universe, why not use that within r-universe for pre-built binaries?

> downloading binaries of unknown origin

Just for the record, I'll have to object to that phrasing ^^ The binaries we download are checksummed and are artifacts created during the Arrow release.
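For readers unfamiliar with the idea, checksum verification of a downloaded artifact can be sketched in shell as below. This is a generic illustration, not the arrow package's actual code: the function name `verify_checksum`, the choice of SHA-256, and the reliance on the coreutils `sha256sum` command are all assumptions made for this example.

```shell
#!/bin/sh
# Hypothetical helper (name and hash algorithm are assumptions):
# succeed only if the file's SHA-256 digest matches the expected value.
# Assumes the coreutils `sha256sum` command is available.
verify_checksum() {
  file="$1"
  expected="$2"
  # sha256sum prints "<digest>  <filename>"; keep only the digest.
  actual=$(sha256sum "$file" | cut -d ' ' -f 1)
  [ "$actual" = "$expected" ]
}
```

A caller would typically run `verify_checksum "$archive" "$expected_digest" || exit 1` before unpacking anything, where both variables are placeholders supplied by the install script.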

assignUser avatar Sep 09 '24 15:09 assignUser

> I think that's reasonable but arrow is also on r-universe, why not use that within r-universe for pre-built binaries?

Sorry, my wording was not clear. My point is that source builds in CI on R-universe are slow. I just thought it would be more reasonable to download libarrow, since R-universe is not CRAN and does not prohibit downloading binaries from outside.

> Just for the record, I'll have to object to that phrasing ^^ The binaries we download are checksummed and are artifacts created during the Arrow release.

I know Apache releases are strict, but I thought the arrow package did not validate the downloaded binaries. Sorry if this was a misunderstanding.

eitsupi avatar Sep 09 '24 15:09 eitsupi

> My point is that source builds in CI in R-universe are slow,

No, I understood that, my question was: why not use the package binaries for arrow that r-universe provides instead of building from source?

assignUser avatar Sep 09 '24 15:09 assignUser

> why not use the package binaries for arrow that r-universe provides instead of building from source?

Since R-universe provides a separate repository for each user, it is irrelevant to one user whether the arrow package exists in another user's repository.

In the nanoarrow example on r-multiverse above, the R-universe of r-multiverse had an arrow 17.0.0 binary that was tied to this repository's release, but CRAN had a 17.0.0.1 source package, so the latter may have taken precedence over the former.

eitsupi avatar Sep 09 '24 15:09 eitsupi

> Since R-universe provides a separate repository for each user

Ah, I see. It should be an easy enough change to enable it, similar to `NOT_CRAN` :)

assignUser avatar Sep 09 '24 16:09 assignUser

Issue resolved by pull request 44476 https://github.com/apache/arrow/pull/44476

assignUser avatar Oct 24 '24 22:10 assignUser