rhub
rHub Solaris different from CRAN Solaris?
Ah Solaris, every R package maintainer's favourite vintage OS - especially if you are using lots of C++ and/or Unicode encodings.
I just updated quanteda and the checks show v2.1.2 is breaking on Solaris on the CRAN results.
On rHub, though, it's clean:
quanteda 2.1.2: NOTE
Build ID: quanteda_2.1.2.tar.gz-08835e9ba4414e1cac78283d9806dc0b
Platform: Oracle Solaris 10, x86, 32 bit, R-release
Submitted: 48 minutes 31.5 seconds ago
Build time: 48 minutes 27 seconds
NOTES:
* checking data for non-ASCII characters ... NOTE
  Note: found 3 marked UTF-8 strings
Any ideas as to what causes the difference, and how I might get the correct check in advance?
I don't think this NOTE prevents CRAN from publishing the package. I don't know why CRAN Solaris is not catching this.
Oh, that's the clean check. :)
Is CRAN using the GNU or the Solaris compilers for your package?
looks like Solaris:
using R version 4.0.2 Patched (2020-09-21 r79235)
using platform: i386-pc-solaris2.10 (32-bit)
That's the platform only, which is the same for both. But it seems like you are using Rcpp, so they are probably using the GNU compilers.
The error seems to be in the R code, anyway.
Calls: summary ... stri_detect_regex -> .handleSimpleError -> h -> .handleSimpleError -> h
So maybe the ICU is different? Or maybe it is some encoding issue? Unfortunately I don't really know how to debug CRAN's machine.
You could try telling them that this works on your Solaris machine, and ask for more info, e.g. the ICU version.
Yes, it's almost certainly the ICU version. Updates via stringi seem to be very platform dependent, as we discovered in https://github.com/quanteda/quanteda/issues/1996.
We are struggling to debug CRAN's Solaris machine too. Unfortunately it seems to be pretty unique. Thanks for trying to help.
CRAN probably special cases stringi on Solaris, and installs it with --disable-cxx11. See https://svn.r-project.org/R-dev-web/trunk/CRAN/QA/BDR/Solaris/x86/packages/tests32/swift.R
I am not sure why they do this, because both the Solaris and the GNU compilers support C++11.
To try to reproduce this, you can get a ready to use Solaris VM (OVA file) with R from here: https://files.r-hub.io/solaris/ There is one file for VirtualBox and another for VMWare.
More instructions here: https://github.com/r-hub/solarischeck/tree/master/packer#updating-r
You can install stringi with the --disable-cxx11 option and see if this installs/uses an older ICU.
Alternatively you can special case your tests/examples to only run if a recent ICU is available.
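One way to special case an example is to probe the regex engine directly instead of hard-coding a version. The sketch below is only an illustration: `icu_supports_pattern` is a hypothetical helper name (it is not part of stringi or quanteda), and the pattern is the `\p{emoji_presentation}` property that shows up in the error reported further down this thread.

```r
# Hypothetical helper (not part of stringi or quanteda): returns TRUE when
# `detect` accepts `pattern` without raising an error, FALSE otherwise.
# The default `detect` is stringi::stri_detect_regex; because R evaluates
# default arguments lazily, it is only looked up when no alternative is given.
icu_supports_pattern <- function(pattern, detect = stringi::stri_detect_regex) {
  tryCatch(
    {
      detect("a", pattern)
      TRUE
    },
    error = function(e) FALSE
  )
}

# Guard an example or test on the result.
if (requireNamespace("stringi", quietly = TRUE) &&
    icu_supports_pattern("^\\p{emoji_presentation}+$")) {
  message("ICU accepts the emoji property; run the full example")
} else {
  message("Older ICU or no stringi; skip the emoji-dependent example")
}
```

The `detect` argument is there so the helper can be exercised with a stand-in function on machines where stringi is not installed.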
Edit: I'll also email CRAN to ask why they special case stringi.
I just tried this. With --disable-cxx11, stringi installs ICU 55, and you get this error:
> corp1 <- corpus(data_char_ukimmig2010[1:2])
> corp2 <- corpus(data_char_ukimmig2010[3:4])
> corp3 <- corpus(data_char_ukimmig2010[5:6])
> summary(c(corp1, corp2, corp3))
Error in h(simpleError(msg, call)) :
error in evaluating the argument 'x' in selecting a method for function 'as.list': error in evaluating the argument 'x' in selecting a method for function 'which': Illegal argument. (U_ILLEGAL_ARGUMENT_ERROR, context=`^\p{emoji_presentation}+$`)
Without that option stringi installs ICU 61, and it works fine:
> corp1 <- corpus(data_char_ukimmig2010[1:2])
> corp2 <- corpus(data_char_ukimmig2010[3:4])
> corp3 <- corpus(data_char_ukimmig2010[5:6])
> summary(c(corp1, corp2, corp3))
Corpus consisting of 6 documents, showing 6 documents:
         Text Types Tokens Sentences
          BNP  1125   3280        88
    Coalition   142    260         4
 Conservative   251    499        15
       Greens   322    677        21
       Labour   298    680        29
       LibDem   251    483        14
I think in general it makes sense to check for the ICU version in quanteda, even if CRAN fixes this, because there might be other installations with older ICU versions around.
@kbenoit It seems that you can check the ICU version in stringi like this:
> stringi::stri_info()[["ICU.version"]]
[1] "61.1"
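Building on that call, a version guard could look roughly like the sketch below. The helper name `icu_at_least` and the threshold "61" are assumptions for illustration only (61 is simply the version that worked in the test above); the actual minimum ICU version quanteda needs was not established in this thread.

```r
# Hypothetical helper: TRUE when the ICU version is >= `min_version`.
# By default the version string comes from stringi::stri_info(); a string
# can also be passed in directly, in which case stringi is never called.
icu_at_least <- function(min_version,
                         icu_version = stringi::stri_info()[["ICU.version"]]) {
  utils::compareVersion(icu_version, min_version) >= 0
}

# Explicit version strings, so no stringi call is made here.
icu_at_least("61", "61.1")  # TRUE
icu_at_least("61", "55.1")  # FALSE: an older version string
```

In a package, something like `if (icu_at_least("61")) { ... }` could then wrap the tests or examples that depend on newer regex properties.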
Yes we have an open issue to deal with ICU versions, but now we have a good reason to move on that. I'll install the VM you mentioned and use that to fix this. Thanks so much! You 🎸!
Cool. Btw. I had to do this on the VM to install quanteda:
sudo pkgutil -y -i r_base
sudo pkgutil -i -y CSWlibxml2-dev
export MAKE=gmake
R
Some dependencies need GNU make. Let me know if you run into issues.
CRAN's Solaris is gone, so I'll close this.
It finally happened! Not with a bang but a whimper. Good riddance.