Reading local Zarr files into stars
Hi,
After looking at the vignette for reading Zarr files in stars, I am unsure how to read local Zarr directories into R. I have been trying to work with satellite imagery for the Southern Ocean downloaded from Copernicus' Marine Data Client.
Here is my attempt:
```r
library(stars)
dsn <- 'ZARR:"sic_daily_samples.zarr/"'
read_mdim(dsn)
```
Which gives the error message:
Error in CPL_read_mdim(file, array_name, options, offset, count, step, :
  CHAR() can only be applied to a 'CHARSXP', not a 'NULL'
In addition: Warning messages:
1: In CPL_read_mdim(file, array_name, options, offset, count, step, :
  GDAL Error 1: Decompressor blosc not handled
2: In CPL_read_mdim(file, array_name, options, offset, count, step, :
  GDAL Error 1: Decompressor blosc not handled
3: In CPL_read_mdim(file, array_name, options, offset, count, step, :
  GDAL Error 1: Decompressor blosc not handled
4: In CPL_read_mdim(file, array_name, options, offset, count, step, :
  GDAL Error 1: Decompressor blosc not handled
I've uploaded a subset of the data for ease but I can't figure out how to read it as a zipped or unzipped file, so any help with this would be appreciated!
Thanks, Josh
I get
> read_mdim("sic_daily_sample.zarr/")
stars object with 3 dimensions and 1 attribute
attribute(s), summary of first 1e+05 cells:
Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
siconc [1] NA NA NA NaN NA NA 1e+05
dimension(s):
from to refsys point
longitude 1 4320 WGS 84 NA
latitude 1 961 WGS 84 NA
time 1 1 POSIXct TRUE
values x/y
longitude [-180.0417,-179.9583),...,[179.875,179.9583) [x]
latitude [-80.04167,-79.95833),...,[-0.04166667,0.04166667) [y]
time 2021-01-09 UTC
What is your `sessionInfo()` and `sf_extSoftVersion()` output, after loading stars?
Thanks Edzer, I tried the same code and got the same error message.
My sessionInfo() gives
R version 4.3.2 (2023-10-31 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19045)
Matrix products: default
locale:
[1] LC_COLLATE=English_United Kingdom.utf8 LC_CTYPE=English_United Kingdom.utf8
[3] LC_MONETARY=English_United Kingdom.utf8 LC_NUMERIC=C
[5] LC_TIME=English_United Kingdom.utf8
time zone: Europe/London
tzcode source: internal
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] stars_0.6-4 sf_1.0-14 abind_1.4-5
loaded via a namespace (and not attached):
[1] utf8_1.2.4 R6_2.5.1 tidyselect_1.2.0 e1071_1.7-13 magrittr_2.0.3
[6] glue_1.6.2 tibble_3.2.1 KernSmooth_2.23-22 parallel_4.3.2 pkgconfig_2.0.3
[11] generics_0.1.3 dplyr_1.1.3 lifecycle_1.0.4 classInt_0.4-10 cli_3.6.1
[16] fansi_1.0.5 vctrs_0.6.4 grid_4.3.2 DBI_1.2.1 proxy_0.4-27
[21] class_7.3-22 compiler_4.3.2 rstudioapi_0.15.0 tools_4.3.2 pillar_1.9.0
[26] Rcpp_1.0.11 rlang_1.1.2 units_0.8-4
And my sf_extSoftVersion() prints
GEOS GDAL proj.4 GDAL_with_GEOS USE_PROJ_H PROJ
"3.11.2" "3.7.2" "9.3.0" "true" "true" "9.3.0"
Please update sf to 1.0-15, and try again.
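For anyone following along, here is a minimal sketch of how to confirm that the update took effect before retrying. The helper name `has_min_version` is hypothetical (it is not part of sf or stars); it relies only on base R's `packageVersion()` and `package_version()`.

```r
# Hypothetical helper: TRUE if `pkg` is installed at version `minimum` or newer.
has_min_version <- function(pkg, minimum) {
  # packageVersion() throws an error when the package is not installed
  installed <- tryCatch(utils::packageVersion(pkg), error = function(e) NULL)
  !is.null(installed) && installed >= package_version(minimum)
}

if (!has_min_version("sf", "1.0-15")) {
  message("sf is older than 1.0-15 (or not installed); ",
          "run install.packages(\"sf\") and restart R")
}
```

Restarting R after installing matters, because an already loaded copy of sf stays in memory until the session ends.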
That still printed the same error message as previously. I haven't yet downloaded the latest version of RStudio but I don't imagine that would cause this error?
See also https://github.com/r-spatial/stars/issues/566#issuecomment-1261880743
Apologies, I'm not yet proficient with R. How do I install that patch? I tried using remotes::install_github("rspatial/sf") but I'm still seeing the same error code.
No need for you to install that patch.
Sorry I'm a bit lost as to what steps I can take from the other issue to fix my issue.
I'm just cross linking them; I can reproduce the error on GitHub actions here: https://github.com/r-spatial/stars/actions/runs/7712573313/job/21020420577#step:6:297
@oshuwilson,
It seems that this issue is specific to the Windows binary release. Note that you can use CopernicusMarine for subsetting Copernicus Marine data as well. However, it does not yet support ZARR data because of the issue reported here and https://github.com/r-spatial/stars/issues/566#issuecomment-1261880743
Thanks @pepijn-devries - I'll look at doing that to download as a netCDF if the Zarr format remains unusable for my setup. My main issue is that the full data I need is massive (~1.3TB as a netCDF but only ~250GB as Zarr), so Zarr would be preferable if it can work! But if not, I'll get a new hard drive and put my computer to the test.
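If the netCDF route ends up being necessary, the whole file does not have to be loaded at once. The sketch below reads only a small window with the `ncsub` argument of `stars::read_ncdf()`. It uses the small example file `nc/reduced.nc` that ships with stars as a stand-in, not the Copernicus sea-ice data; the variable name `anom` and the start/count values are specific to that example file and would need to be adapted.

```r
# Assumes the stars, ncmeta and RNetCDF packages are installed; otherwise nothing runs.
if (requireNamespace("stars", quietly = TRUE) &&
    requireNamespace("ncmeta", quietly = TRUE) &&
    requireNamespace("RNetCDF", quietly = TRUE)) {
  # Example file bundled with stars (a stand-in for a large netCDF download)
  f <- system.file("nc/reduced.nc", package = "stars")
  # Read a 10 x 12 window of one variable: one start/count row per dimension
  sub <- stars::read_ncdf(
    f,
    var = "anom",
    ncsub = cbind(start = c(1, 1, 1, 1), count = c(10, 12, 1, 1))
  )
  print(dim(sub))
}
```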
> It seems that this issue is specific to the Windows binary release.

Windows and macOS binary releases; we added blosc, at least to Windows binary builds, but this suggests it's not working.
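The failure has a recognizable signature: GDAL warns "Decompressor blosc not handled" and the read then errors. Below is a minimal sketch of a wrapper that captures those messages so a script can detect the situation and fall back to another format. `try_read` and `fake_reader` are hypothetical names, not part of stars or sf; the wrapper uses only base R condition handling, and the reader is passed in as an argument.

```r
# Hypothetical helper (not part of stars or sf): call `read_fun(...)`,
# collect warning and error messages, and flag the blosc signature.
try_read <- function(read_fun, ...) {
  msgs <- character(0)
  result <- withCallingHandlers(
    tryCatch(
      read_fun(...),
      error = function(e) {
        msgs <<- c(msgs, conditionMessage(e))
        NULL  # a failed read yields NULL instead of stopping the script
      }
    ),
    warning = function(w) {
      msgs <<- c(msgs, conditionMessage(w))
      invokeRestart("muffleWarning")  # keep going after recording the warning
    }
  )
  list(
    result = result,
    blosc_missing = any(grepl("Decompressor blosc not handled", msgs, fixed = TRUE)),
    messages = msgs
  )
}

# Stand-in reader that mimics the failure reported in this thread
fake_reader <- function(dsn) {
  warning("GDAL Error 1: Decompressor blosc not handled")
  stop("CHAR() can only be applied to a 'CHARSXP', not a 'NULL'")
}

out <- try_read(fake_reader, 'ZARR:"sic_daily_samples.zarr/"')
out$blosc_missing
```

With real data the call would be `try_read(stars::read_mdim, dsn)`; a `blosc_missing` value of `TRUE` points to the GDAL build lacking blosc support rather than to a problem with the Zarr store itself.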
Hi @edzer,
Is there any news on the Windows build and blosc decompression of ZARR files? Thanks for your work on the package!
By the way, I did some additional testing. The issue occurs not only on Windows but also on a Fedora Linux (virtual) machine I have set up:
library(stars)
#> Loading required package: abind
#> Loading required package: sf
#> Linking to GEOS 3.12.1, GDAL 3.7.3, PROJ 9.2.1; sf_use_s2() is TRUE
dsn <- 'ZARR:"/vsicurl/https://ncsa.osn.xsede.org/Pangeo/pangeo-forge/gpcp-feedstock/gpcp.zarr"'
bounds <- c(longitude = "lon_bounds", latitude = "lat_bounds")
r <- read_mdim(dsn, bounds = bounds)
#> Warning in CPL_read_mdim(file, array_name, options, offset, count, step, : GDAL
#> Error 1: Decompressor blosc not handled
#> Warning in CPL_read_mdim(file, array_name, options, offset, count, step, : GDAL
#> Error 1: Decompressor blosc not handled
#> Warning in CPL_read_mdim(file, array_name, options, offset, count, step, : GDAL
#> Error 1: Decompressor blosc not handled
#> Warning in CPL_read_mdim(file, array_name, options, offset, count, step, : GDAL
#> Error 1: Decompressor blosc not handled
#> Warning in CPL_read_mdim(file, array_name, options, offset, count, step, : GDAL
#> Error 1: Decompressor blosc not handled
#> Warning in CPL_read_mdim(file, array_name, options, offset, count, step, : GDAL
#> Error 1: Decompressor blosc not handled
#> Warning in CPL_read_mdim(file, array_name, options, offset, count, step, : GDAL
#> Error 1: Decompressor blosc not handled
#> Error in CPL_read_mdim(file, array_name, options, offset, count, step, : CHAR() can only be applied to a 'CHARSXP', not a 'NULL'
Created on 2024-03-11 with reprex v2.1.0
With sessionInfo():
R version 4.3.2 (2023-10-31)
Platform: x86_64-redhat-linux-gnu (64-bit)
Running under: Fedora Linux 39 (Workstation Edition)
Matrix products: default
BLAS/LAPACK: FlexiBLAS OPENBLAS-OPENMP; LAPACK version 3.11.0
locale:
[1] LC_CTYPE=nl_NL.UTF-8 LC_NUMERIC=C LC_TIME=nl_NL.UTF-8 LC_COLLATE=nl_NL.UTF-8
[5] LC_MONETARY=nl_NL.UTF-8 LC_MESSAGES=nl_NL.UTF-8 LC_PAPER=nl_NL.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=nl_NL.UTF-8 LC_IDENTIFICATION=C
time zone: Europe/Amsterdam
tzcode source: system (glibc)
attached base packages:
[1] stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
[1] gtable_0.3.4 dplyr_1.1.4 compiler_4.3.2 tidyselect_1.2.0 reprex_2.1.0 Rcpp_1.0.12
[7] clipr_0.8.0 callr_3.7.5 scales_1.3.0 yaml_2.3.8 fastmap_1.1.1 ggplot2_3.5.0
[13] R6_2.5.1 generics_0.1.3 classInt_0.4-10 sf_1.0-15 knitr_1.45 tibble_3.2.1
[19] units_0.8-5 munsell_0.5.0 DBI_1.2.2 pillar_1.9.0 rlang_1.1.3 utf8_1.2.4
[25] xfun_0.42 fs_1.6.3 cli_3.6.2 withr_3.0.0 magrittr_2.0.3 ps_1.7.6
[31] class_7.3-22 processx_3.8.3 digest_0.6.34 grid_4.3.2 rstudioapi_0.15.0 lifecycle_1.0.4
[37] vctrs_0.6.5 KernSmooth_2.23-22 proxy_0.4-27 evaluate_0.23 glue_1.7.0 fansi_1.0.6
[43] e1071_1.7-14 colorspace_2.1-0 rmarkdown_2.26 tools_4.3.2 pkgconfig_2.0.3 htmltools_0.5.7
Same here, using macOS.
library(stars)
> dsn = 'ZARR:"/vsicurl/https://storage.googleapis.com/cmip6/CMIP6/HighResMIP/CMCC/CMCC-CM2-HR4/highresSST-present/r1i1p1f1/6hrPlev/psl/gn/v20170706"/'
> gdal_utils("info", dsn)
Warning messages:
1: In CPL_gdalinfo(if (missing(source)) character(0) else source, options, :
GDAL Error 1: Decompressor blosc not handled
2: In CPL_gdalinfo(if (missing(source)) character(0) else source, options, :
GDAL Error 1: Decompressor blosc not handled
3: In CPL_gdalinfo(if (missing(source)) character(0) else source, options, :
GDAL Error 1: Decompressor blosc not handled
4: In CPL_gdalinfo(if (missing(source)) character(0) else source, options, :
GDAL Error 1: Decompressor blosc not handled
5: In CPL_gdalinfo(if (missing(source)) character(0) else source, options, :
GDAL Error 1: Decompressor blosc not handled
6: In CPL_gdalinfo(if (missing(source)) character(0) else source, options, :
GDAL Error 1: Decompressor blosc not handled
7: In CPL_gdalinfo(if (missing(source)) character(0) else source, options, :
GDAL Error 1: Decompressor blosc not handled
With sessionInfo():
R version 4.3.1 (2023-06-16)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS Monterey 12.5.1
Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
time zone: Europe/Berlin
tzcode source: internal
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] sf_1.0-16
loaded via a namespace (and not attached):
[1] compiler_4.3.1 magrittr_2.0.3 class_7.3-22 DBI_1.2.3 tools_4.3.1 units_0.8-5 proxy_0.4-27 rstudioapi_0.16.0 Rcpp_1.0.13 KernSmooth_2.23-24 grid_4.3.1 e1071_1.7-14 classInt_0.4-10