Can no longer read NC files using `read_stars()`
Hi all,
I am trying to rerun old code that uses stars, and it no longer works on one of my machines. The issue seems to be that one setup can no longer read (some?) NC files.
Below you will see the two setups compared. They use the following file: https://www.dropbox.com/t/6MA5qGvcM1GCy1kx (please download for debugging purposes only, not for use)
The first setup works; the second does create a stars object, but it contains no data. The version of stars is the same on both systems, but I can see that the versions of GDAL and PROJ are more recent on the system that does not work.
My questions are:
- is the problem expected?
- how can I make sure it works with recent system libraries?
- is there any hope for me to write code that will remain relatively stable over time?
> stars::read_stars("gfdl-esm4_r1i1p1f1_w5e5_ssp126_tas_lat27.0to72.0lon-13.0to56.0_daily_2015_2020.nc")
stars_proxy object with 1 attribute in 1 file(s):
$`gfdl-esm4_r1i1p1f1_w5e5_ssp126_tas_lat27.0to72.0lon-13.0to56.0_daily_2015_2020.nc`
[1] "gfdl-esm4_r1i1p1f1_w5e5_ssp126_tas_lat27.0to72.0lon-13.0to56.0_daily_2015_2020.nc"
dimension(s):
from to offset delta refsys x/y
x 1 720 -180 0.5 NA [x]
y 1 360 90 -0.5 NA [y]
time 1 2192 2015-01-01 UTC 1 days POSIXct
> library(stars)
Loading required package: abind
Loading required package: sf
Linking to GEOS 3.12.2, GDAL 3.8.5, PROJ 9.3.1; sf_use_s2() is TRUE
WARNING: different compile-time and runtime versions for GEOS found:
Linked against: 3.12.2-CAPI-1.18.2 compiled against: 3.12.1-CAPI-1.18.1
It is probably a good idea to reinstall sf (and maybe lwgeom too)
> packageVersion("stars")
[1] ‘0.6.6’
> stars::read_stars("gfdl-esm4_r1i1p1f1_w5e5_ssp126_tas_lat27.0to72.0lon-13.0to56.0_daily_2015_2020.nc")
stars_proxy object with 1 attribute in 1 file(s):
$`gfdl-esm4_r1i1p1f1_w5e5_ssp126_tas_lat27.0to72.0lon-13.0to56.0_daily_2015_2020.nc`
[1] "[...]/gfdl-esm4_r1i1p1f1_w5e5_ssp126_tas_lat27.0to72.0lon-13.0to56.0_daily_2015_2020.nc"
dimension(s):
from to offset delta refsys x/y
x 1 720 -180 0.5 NA [x]
y 1 360 90 -0.5 NA [y]
time 1 1 <NA> NA POSIXct
Warning message:
In parse_netcdf_meta(meta_data, get_names(x)) : NAs introduced by coercion
> library(stars)
Loading required package: abind
Loading required package: sf
Linking to GEOS 3.12.2, GDAL 3.9.2, PROJ 9.4.1; sf_use_s2() is TRUE
> packageVersion("stars")
[1] ‘0.6.6’
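One way to narrow this down, assuming the `NAs introduced by coercion` warning arises while stars parses the netCDF metadata that GDAL reports, is to look at that metadata directly on both setups. The sketch below is a minimal shell helper (the name `nc_dim_meta` is hypothetical, and it assumes the `gdalinfo` command-line tool is installed) that prints only the `NETCDF_DIM_*` entries, which describe extra dimensions such as `time`:

```shell
# Hypothetical helper: print the NETCDF_DIM_* metadata lines that GDAL
# reports for a netCDF file. Assumes the gdalinfo CLI is on the PATH.
nc_dim_meta() {
    gdalinfo "$1" | grep 'NETCDF_DIM_'
}

# Example (run on both setups and compare the output):
# nc_dim_meta gfdl-esm4_r1i1p1f1_w5e5_ssp126_tas_lat27.0to72.0lon-13.0to56.0_daily_2015_2020.nc
```

If the `time` entries are missing or differ between the two setups, the difference probably lies below stars, in what GDAL is able to read from the file.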
I'm seeing
> stars::read_stars("gfdl-esm4_r1i1p1f1_w5e5_ssp126_tas_lat27.0to72.0lon-13.0to56.0_daily_2015_2020.nc")
stars_proxy object with 1 attribute in 1 file(s):
$`gfdl-esm4_r1i1p1f1_w5e5_ssp126_tas_lat27.0to72.0lon-13.0to56.0_daily_2015_2020.nc`
[1] "gfdl-esm4_r1i1p1f1_w5e5_ssp126_tas_lat27.0to72.0lon-13.0to56.0_daily_2015_2020.nc"
dimension(s):
from to offset delta refsys x/y
x 1 720 -180 0.5 NA [x]
y 1 360 90 -0.5 NA [y]
time 1 2192 2015-01-01 UTC 1 days POSIXct
> library(stars)
Loading required package: abind
Loading required package: sf
Linking to GEOS 3.12.1, GDAL 3.9.2, PROJ 9.4.0; sf_use_s2() is TRUE
> packageVersion("stars")
[1] '0.6.6'
where I don't believe that the PROJ version matters. What is your sessionInfo()? This is mine:
> sessionInfo()
R version 4.4.1 (2024-06-14)
Platform: x86_64-pc-linux-gnu
Running under: Ubuntu 22.04.2 LTS
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.10.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.10.0
locale:
[1] C
time zone: Etc/UTC
tzcode source: system (glibc)
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] stars_0.6-6 sf_1.0-17 abind_1.4-5
loaded via a namespace (and not attached):
[1] e1071_1.7-14 magrittr_2.0.3 KernSmooth_2.23-24 parallel_4.4.1
[5] classInt_0.4-10 cli_3.6.3 grid_4.4.1 DBI_1.2.3
[9] proxy_0.4-27 class_7.3-22 compiler_4.4.1 tools_4.4.1
[13] Rcpp_1.0.13 rlang_1.1.4 units_0.8-5
On the system for which the reading does not work:
> sessionInfo()
R version 4.3.3 (2024-02-29)
Platform: x86_64-conda-linux-gnu (64-bit)
Running under: Fedora Linux 40 (Forty)
Matrix products: default
BLAS: /mnt/smb2/courtiol/.conda/envs/r_viaconda/lib/libblis.so.4.0.0
LAPACK: /mnt/smb2/courtiol/.conda/envs/r_viaconda/lib/liblapack.so.3.9.0
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8
[6] LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
time zone: Europe/Berlin
tzcode source: system (glibc)
attached base packages:
[1] stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
[1] e1071_1.7-14 magrittr_2.0.3 abind_1.4-5 KernSmooth_2.23-24 parallel_4.3.3 classInt_0.4-10 sf_1.0-16 cli_3.6.3
[9] grid_4.3.3 DBI_1.2.3 proxy_0.4-27 class_7.3-22 compiler_4.3.3 tools_4.3.3 stars_0.6-6 Rcpp_1.0.13
[17] rlang_1.1.4 units_0.8-5
I will try to create a full reprex using conda...
What is the sessionInfo() on the system where it works?
It works on that system:
> sessionInfo()
R version 4.4.1 (2024-06-14)
Platform: x86_64-redhat-linux-gnu
Running under: Fedora Linux 40 (KDE Plasma)
Matrix products: default
BLAS/LAPACK: FlexiBLAS OPENBLAS-OPENMP; LAPACK version 3.11.0
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
time zone: Europe/Berlin
tzcode source: system (glibc)
attached base packages:
[1] stats graphics grDevices datasets utils methods base
loaded via a namespace (and not attached):
[1] CoprManager_0.5.7 later_1.3.2 R6_2.5.1 httpuv_1.6.15
[5] e1071_1.7-14 magrittr_2.0.3 abind_1.4-5 KernSmooth_2.23-24
[9] parallel_4.4.1 classInt_0.4-10 sf_1.0-16 promises_1.3.0
[13] cli_3.6.3 grid_4.4.1 DBI_1.2.3 proxy_0.4-27
[17] class_7.3-22 compiler_4.4.1 tools_4.4.1 stars_0.6-6
[21] Rcpp_1.0.13 rlang_1.1.4 jsonlite_1.8.8 units_0.8-5
But while trying to create a reprex, I noticed that it also works on a setup very close to the problematic machine:
library(stars)
Loading required package: abind
Loading required package: sf
Linking to GEOS 3.12.2, GDAL 3.9.2, PROJ 9.4.1; sf_use_s2() is TRUE
> sessionInfo()
R version 4.4.1 (2024-06-14)
Platform: x86_64-conda-linux-gnu
Running under: Fedora Linux 40 (KDE Plasma)
Matrix products: default
BLAS/LAPACK: /home/courtiol/.conda/envs/stars/lib/libopenblasp-r0.3.27.so; LAPACK version 3.12.0
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
time zone: Europe/Berlin
tzcode source: system (glibc)
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] stars_0.6-6 sf_1.0-16 abind_1.4-5
loaded via a namespace (and not attached):
[1] compiler_4.4.1 magrittr_2.0.3 class_7.3-22 parallel_4.4.1
[5] tools_4.4.1 DBI_1.2.3 units_0.8-5 proxy_0.4-27
[9] Rcpp_1.0.13 KernSmooth_2.23-24 grid_4.4.1 e1071_1.7-14
[13] classInt_0.4-10 rlang_1.1.4
So indeed, this does not seem to be specific to GDAL, PROJ or other spatial libraries...
All the packages have the same versions, with the exception of the base packages (4.3.3 vs 4.4.1), and the geo-libraries are the same too.
I have a hard time seeing how base packages could get in the way here, so I am a little baffled.
I will run update.packages(checkBuilt = TRUE) to make sure this is not a compilation issue introduced when R was downgraded, a downgrade that was itself triggered by some outdated binaries on anaconda.
My guess would be that the problem is conda.
Recompiling everything did not help... I will attempt to create a new conda environment that sticks to R 4.4.1.
(Off topic: have you experienced issues with conda before? It matters to us since we have a GIS lab with several researchers and we are thinking of making them use conda for all the remote work...)
I see things like this:
> library(xml2)
Warning: program compiled against libxml 210 using older 209
which is caused by conda putting itself at the front of the PATH
$ which xml2-config
/home/edzer/miniconda3/bin/xml2-config
and then R happily ignoring conda libs during runtime.
I like that my system package manager keeps libraries in sane order, and when I want to do something outside the box I do it in a container. Conda seems to do something halfway between those two, but only gets in the way when you're not using Python IMO.
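The shadowing described above can be checked by listing every copy of a tool found on the PATH, in lookup order. This is a minimal sketch, assuming a POSIX shell; the function name `all_on_path` is hypothetical:

```shell
# Hypothetical helper: list every executable file named $1 found on the
# PATH, in the order the shell would look them up (first line wins).
all_on_path() {
    name=$1
    old_ifs=$IFS
    IFS=:
    for dir in $PATH; do
        # An empty PATH component means the current directory.
        candidate="${dir:-.}/$name"
        if [ -f "$candidate" ] && [ -x "$candidate" ]; then
            printf '%s\n' "$candidate"
        fi
    done
    IFS=$old_ifs
}

# Example:
# all_on_path xml2-config
```

On a machine where conda has prepended itself to the PATH, the conda copy would be expected to appear first, ahead of the one installed by the system package manager, if both are present.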
I confirm that the following conda recipe reads the NC fine on one machine but not on the other:
conda config --add channels conda-forge
conda create -n test
conda activate test
conda install R r-stars
R --vanilla
stars::read_stars("gfdl-esm4_r1i1p1f1_w5e5_ssp126_tas_lat27.0to72.0lon-13.0to56.0_daily_2015_2020.nc")
So my next step is to install R directly on the remote machine, to see if that is indeed a mysterious conda issue...
Many thanks @edzer, I am closing the issue for now since I don't think that it is a stars issue per se, but I will document what is going on for others to know if they ever face the same issue.
OK, so after installing R and everything else directly on the server, without conda, the same problem remained. Conda was thus not the culprit for this particular issue. Instead, I noticed that the problem disappears when the NC file is stored on the computer doing the computation, as opposed to our default, which is to keep the data on a data server mounted as a Samba share. So this is a Samba-related issue, which may or may not apply depending on the Samba version and settings. If you have the same issue, try putting the data on storage that is not mounted through Samba.
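The workaround above can be sketched as a pair of shell helpers: one reports the type of the filesystem holding a path, the other copies a file to local scratch space and verifies the copy before it is read. Both function names are hypothetical, and `stat -f -c %T` assumes GNU coreutils on Linux; the exact type string reported for a Samba mount (for example `cifs` or `smb2`) may vary.

```shell
# Hypothetical helper: print the type of the filesystem that holds a path.
# Assumes GNU coreutils stat (Linux).
fs_type() {
    stat -f -c %T "$1"
}

# Hypothetical helper: copy a file into a local directory (default: $TMPDIR
# or /tmp), check that the copy is byte-identical, and print the copy's path.
stage_locally() {
    src=$1
    dest_dir=${2:-${TMPDIR:-/tmp}}
    dest="$dest_dir/$(basename "$src")"
    cp "$src" "$dest" || return 1
    cmp -s "$src" "$dest" || return 1
    printf '%s\n' "$dest"
}

# Example (the path on the share is a placeholder):
# fs_type /path/on/share/file.nc
# local_copy=$(stage_locally /path/on/share/file.nc)
```

The printed path can then be passed to `stars::read_stars()` in place of the path on the share.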
Thanks for sharing your diagnosis, @courtiol !