
Error in qdeserialize(x) : Endian of system doesn't match file endian

Open barracuda156 opened this issue 1 year ago • 15 comments


R version 4.2.3 (2023-03-15) -- "Shortstop Beagle"
Copyright (C) 2023 The R Foundation for Statistical Computing
Platform: powerpc-apple-darwin10.8.0 (32-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> total_time <- Sys.time()
> 
> suppressMessages(library(Rcpp))
> suppressMessages(library(dplyr))
> suppressMessages(library(data.table))
> suppressMessages(library(qs))
> suppressMessages(library(stringfish))
> options(warn = 1)
> 
> do_gc <- function() {
+   if (utils::compareVersion(as.character(getRversion()), "3.5.0") != -1) {
+     gc(full = TRUE)
+   } else {
+     gc()
+   }
+ }
> 
> # because sourceCpp uses setwd, we need absolute path to R_TESTS when run within R CMD check
> R_TESTS <- Sys.getenv("R_TESTS") # startup.Rs
> if (nzchar(R_TESTS)) {
+   R_TESTS_absolute <- normalizePath(R_TESTS)
+   Sys.setenv(R_TESTS = R_TESTS_absolute)
+ }
> sourceCpp(code = decode_source(
+ c("un]'BAAA@QRtHACAAAAAAA+>nAAAv7#aT)JXC:JAR%*QaAh72AB'B'vw5pac6M<xR5V+cWn+KxIBy6|r,OVt?2~X%:xAw/,f}d^_#|XKWFvW%N#TD'H'$}!eH:<{E(H&Yk90NjkdSLMP5[S$2_W,xfO(ao}fQ+jw",
+   "Q{>6_%ygB8MFP)gz)^m++prny$p$2zd4,TjRyD]#^IDs$AEA.Iln5o|!b6Rg,?H[7:4>fVhjk;Elgs[t~/2QV.smWKr)qciq:,gJ.WM#<7X[GTC*H}p8LL/GQv]6d>R=O>iPUN11/~8!@P^g#xecEHjR>JF<,zuB",
+   "8d@Aq1w1Wu;h`BaHYM2BlL6'_X((9Fn4,ns<9^5xcw[_.)4nTTMPw~^2pcKT)+g&])=3]x2;(q7gVbF5qI7RS.hY;}@^Pu~Qxr5/V!#B6}G{Csfkb&I^Xe;hLkO}dX;5`'Wd8?BvZ*@laa2qbX<XE_{|7H*;869]",
+   "zXa+QU~nU3~Xan{Pt5:LtE;TJ=^8_jDXcl#X:u)M`h&a&t&':CQ0!0atQoDNsGfRotbL2BvG&7;TM<uKn>{L%h{E2WwF+}2aDp01lLf&+8HLAbetZ_hlWHeGgi|Xl.U@;O~RhGYsXC1e}#R]e=ky)D<SpP+)~|XO",
+   "TYww=2?PA~!09BKVaX]Kr1Xt[O&{gzkTc9KbV=<uAA+ivS![q)L4F#n5'*XTy2YPl?+(1Szz:4klMBs?9Bk9!wKDZV'mx*Qb#CLRs6Sd1[5HYHk;:H2d{CZt|=iTU2EwD&=pD(:wGGm_$H$WNFG'g9aOTl4q^IQd",
+   "KCA4q>Z>Lku@C8Iy")))
Error in qdeserialize(x) : Endian of system doesn't match file endian
Calls: sourceCpp -> writeLines -> decode_source -> qdeserialize
Execution halted

Is this a problem with the test, or is big-endian not supported in the code itself?

barracuda156 • Mar 20 '23 19:03

Which system uses big endian? Solaris?

I tried to make the endianness code as general as possible, but I haven't set up any system to test it (and CRAN no longer does either), so it's possible it's a bug.

If you could help figure out the easiest way to set up a test environment I would like to look into it.

traversc • Mar 20 '23 20:03

@traversc Thank you for responding!

Which system uses big endian? Solaris?

In my case it is macOS on PPC, but there are quite a number of big-endian systems, including several *BSD ports and some Linux ones.

I tried to make the endianness code as general as possible, but I haven't set up any system to test it (and CRAN no longer does either), so it's possible it's a bug.

It seems to be this error message: https://github.com/traversc/qs/blob/cd68e2acb4f0f7960e02f781bd66d56b560e722f/src/qs_common.h#L755

What I am not sure about is whether the test code is little-endian-only, or whether something actually fails in qs itself.
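For readers following along, a check of this kind can be pictured roughly as below. This is a minimal sketch and not the qs source: the one-byte flag, `HypotheticalHeader`, and `check_header_endian` are assumptions made for illustration. Only the error text is taken from the log above.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <stdexcept>

// Runtime probe: true when the least significant byte is stored first.
inline bool system_is_little_endian() {
  const std::uint16_t probe = 1;
  unsigned char first_byte = 0;
  std::memcpy(&first_byte, &probe, 1);
  return first_byte == 1;
}

// Hypothetical header layout (an assumption, not the real qs format):
// a single flag byte, 1 = written on big-endian, 0 = written on little-endian.
struct HypotheticalHeader {
  std::uint8_t big_endian_flag;
};

// Refuse to read a file whose recorded byte order differs from the host's.
inline void check_header_endian(const HypotheticalHeader& h) {
  assert(h.big_endian_flag <= 1);  // sanity check on the illustrative flag
  const bool file_is_big = (h.big_endian_flag == 1);
  const bool system_is_big = !system_is_little_endian();
  if (file_is_big != system_is_big) {
    throw std::runtime_error("Endian of system doesn't match file endian");
  }
}
```

Under this assumption, a blob serialized on a little-endian machine would be rejected on PowerPC even if the reader itself is correct. That would be consistent with the failure coming from the embedded test data rather than from the library.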

If you could help figure out the easiest way to set up a test environment I would like to look into it.

I can test anything on my end, if that would be helpful. Setting up a test environment without PowerPC hardware may not be very easy. (Well, you could use Rosetta on 10.6.8 Server in a VM, which is what I do when I am away from native hardware, but I do not expect anyone to bother with that just for a single fix.)

barracuda156 • Mar 20 '23 20:03

macOS should have __BIG_ENDIAN__ defined on PowerPC, I believe: https://opensource.apple.com/source/xnu/xnu-1228/libkern/libkern/OSByteOrder.h.auto.html
That is the macro checked here: https://github.com/traversc/qs/blob/master/src/xxhash/xxhash.h#L1067

Alternatively, __POWERPC__ can be used; on macOS it always implies big-endian.
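The macro-based detection discussed here can be sketched as follows. This is an illustration only, not what qs or xxhash actually do: `SKETCH_IS_BIG_ENDIAN`, `runtime_is_big_endian`, and `verify_endian_macros` are hypothetical names, and the `__BYTE_ORDER__` branch is an extra GCC/Clang check added for the sketch. The runtime probe is there to cross-check the compile-time answer.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Compile-time guess from predefined macros (hypothetical macro name).
// __BIG_ENDIAN__ is predefined by Apple's compilers on PowerPC;
// __BYTE_ORDER__ is the generic GCC/Clang spelling.
#if defined(__BIG_ENDIAN__) || \
    (defined(__APPLE__) && defined(__POWERPC__)) || \
    (defined(__BYTE_ORDER__) && defined(__ORDER_BIG_ENDIAN__) && \
     __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__)
#  define SKETCH_IS_BIG_ENDIAN 1
#else
#  define SKETCH_IS_BIG_ENDIAN 0
#endif

// Runtime probe: on a big-endian host the most significant byte comes first.
inline bool runtime_is_big_endian() {
  const std::uint32_t probe = 0x01020304u;
  unsigned char bytes[4];
  std::memcpy(bytes, &probe, sizeof probe);
  return bytes[0] == 0x01;
}

// True when the macro-based answer matches what the hardware actually does.
inline bool macros_agree_with_runtime() {
  return (SKETCH_IS_BIG_ENDIAN != 0) == runtime_is_big_endian();
}

// Debug-time self-check that could run once at start-up.
inline void verify_endian_macros() {
  assert(macros_agree_with_runtime());
}
```

If `macros_agree_with_runtime()` returned false on the PowerPC build, that would point to the defines. If it returns true, the macros are probably not the cause.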

barracuda156 • Mar 20 '23 20:03

@traversc My thought was that the sample embedded in the test file is endian-specific, and that this triggered the error. If that is not the case, then endianness handling goes wrong somewhere and the defines need to be fixed.
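To illustrate why a pre-serialized sample can be endian-specific, here is a small sketch. It assumes nothing about the qs format; `read_u32_le`, `read_u32_be`, `read_u32_native`, and `byteswap32` are hypothetical helpers. The same four bytes decode to different integers depending on the byte order the reader assumes.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Decode four bytes as a little-endian 32-bit integer, on any host.
inline std::uint32_t read_u32_le(const unsigned char* p) {
  assert(p != nullptr);
  return static_cast<std::uint32_t>(p[0]) |
         (static_cast<std::uint32_t>(p[1]) << 8) |
         (static_cast<std::uint32_t>(p[2]) << 16) |
         (static_cast<std::uint32_t>(p[3]) << 24);
}

// Decode the same four bytes as a big-endian 32-bit integer, on any host.
inline std::uint32_t read_u32_be(const unsigned char* p) {
  assert(p != nullptr);
  return (static_cast<std::uint32_t>(p[0]) << 24) |
         (static_cast<std::uint32_t>(p[1]) << 16) |
         (static_cast<std::uint32_t>(p[2]) << 8) |
         static_cast<std::uint32_t>(p[3]);
}

// Decode using the host's own byte order; the result depends on the machine.
inline std::uint32_t read_u32_native(const unsigned char* p) {
  assert(p != nullptr);
  std::uint32_t v = 0;
  std::memcpy(&v, p, sizeof v);
  return v;
}

// Reverse the byte order of a 32-bit value.
inline std::uint32_t byteswap32(std::uint32_t v) {
  return ((v & 0x000000FFu) << 24) | ((v & 0x0000FF00u) << 8) |
         ((v & 0x00FF0000u) >> 8) | ((v & 0xFF000000u) >> 24);
}
```

For the bytes `78 56 34 12`, `read_u32_le` yields `0x12345678` and `read_u32_be` yields `0x78563412`. A reader that uses the native order therefore sees a different number on PowerPC than on x86, unless the format fixes one byte order or the reader swaps bytes.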

barracuda156 • Mar 21 '23 06:03

Ah, you could be right that it's the test where the issue is, not the code.

Does the package otherwise work outside of the test? The last time I checked Solaris it was working, but it's been a while.

traversc • Mar 22 '23 16:03

Ah, you could be right that it's the test where the issue is, not the code. Does the package otherwise work outside of the test? The last time I checked Solaris it was working, but it's been a while.

@traversc Is there some external test case that I can use to check it?

barracuda156 • Mar 22 '23 21:03

Could you try running this modified test script?

https://gist.github.com/traversc/55b7a8b9b060eb2df0829f170365dc41

traversc • Mar 26 '23 04:03

@traversc The first test passes (this is the one for which I replaced the script, though perhaps it passed originally as well):

R version 4.2.3 (2023-03-15) -- "Shortstop Beagle"
Copyright (C) 2023 The R Foundation for Statistical Computing
Platform: powerpc-apple-darwin10.8.0 (32-bit)

> total_time <- Sys.time()
> 
> suppressMessages(library(Rcpp))
> suppressMessages(library(dplyr))
> suppressMessages(library(data.table))
> suppressMessages(library(qs))
> suppressMessages(library(stringfish))
> options(warn = 1)
> 
> do_gc <- function() {
+   if (utils::compareVersion(as.character(getRversion()), "3.5.0") != -1) {
+     gc(full = TRUE)
+   } else {
+     gc()
+   }
+ }
> 
> # because sourceCpp uses setwd, we need absolute path to R_TESTS when run within R CMD check
> R_TESTS <- Sys.getenv("R_TESTS") # startup.Rs
> if (nzchar(R_TESTS)) {
+   R_TESTS_absolute <- normalizePath(R_TESTS)
+   Sys.setenv(R_TESTS = R_TESTS_absolute)
+ }
> sourceCpp(code="#include <Rcpp.h>
+ using namespace Rcpp;
+ // [[Rcpp::plugins(cpp11)]]
+ // [[Rcpp::export(rng=false)]]
+ CharacterVector splitstr(std::string x, std::vector<double> cuts){
+   CharacterVector ret(cuts.size() - 1);
+   for(uint64_t i=1; i<cuts.size(); i++) {
+     ret[i-1] = x.substr(std::round(cuts[i-1])-1, std::round(cuts[i])-std::round(cuts[i-1]));
+   }
+   return ret;
+ }
+ // [[Rcpp::export(rng=false)]]
+ int setlev(SEXP x, int i) {
+   return SETLEVELS(x,i);
+ }
+ // [[Rcpp::export(rng=false)]]
+ void setobj(SEXP x, int i) {
+   return SET_OBJECT(x, i);
+ }
+ // [[Rcpp::export(rng=false)]]
+ List generateList(std::vector<int> list_elements){
+   auto randchar = []() -> char
+   {
+     const char charset[] =
+       \"0123456789\"
+       \"ABCDEFGHIJKLMNOPQRSTUVWXYZ\"
+       \"abcdefghijklmnopqrstuvwxyz\";
+     const size_t max_index = (sizeof(charset) - 1);
+     return charset[ rand() % max_index ];
+   };
+   List ret(list_elements.size());
+   std::string str(10,0);
+   for(size_t i=0; i<list_elements.size(); i++) {
+     switch(list_elements[i]) {
+     case 1:
+       ret[i] = R_NilValue;
+       break;
+     case 2:
+       std::generate_n( str.begin(), 10, randchar );
+       ret[i] = str;
+       break;
+     case 3:
+       ret[i] = rand();
+       break;
+     case 4:
+       ret[i] = static_cast<double>(rand());
+       break;
+     }
+   }
+   return ret;
+ }")
> if (nzchar(R_TESTS)) Sys.setenv(R_TESTS = R_TESTS)
> 
> args <- commandArgs(T)
> if (nzchar(R_TESTS) || ((length(args) > 0) && args[1] == "check")) { # do fewer tests within R CMD check so it completes within a reasonable amount of time
+   mode <- "filestream"
+   reps <- 2
+   test_points <- c(0, 1, 2, 4, 8, 2^5 - 1, 2^5 + 1, 2^5, 2^8 - 1, 2^8 + 1, 2^8, 2^16 - 1, 2^16 + 1, 2^16, 1e6)
+   test_points_slow <- c(0, 1, 2, 4, 8, 2^5 - 1, 2^5 + 1, 2^5, 2^8 - 1, 2^8 + 1, 2^8, 2^16 - 1, 2^16 + 1, 2^16) # for Character Vector, stringfish and list
+   max_size <- 1e6
+ } else {
+   if (length(args) == 0) {
+     mode <- "filestream"
+     reps <- 3
+   } else {
+     mode <- args[1] # fd, memory, HANDLE
+     reps <- as.numeric(args[2])
+   }
+   test_points <- c(0, 1, 2, 4, 8, 2^5 - 1, 2^5 + 1, 2^5, 2^8 - 1, 2^8 + 1, 2^8, 2^16 - 1, 2^16 + 1, 2^16, 1e6, 1e7)
+   test_points_slow <- test_points
+   max_size <- 1e7
+ }
> myfile <- tempfile()
> 
> obj_size <- 0
> get_obj_size <- function() {
+   get("obj_size", envir = globalenv())
+ }
> set_obj_size <- function(x) {
+   assign("obj_size", get_obj_size() + as.numeric(object.size(x)), envir = globalenv())
+   return(get_obj_size());
+ }
> random_object_generator <- function(N, with_envs = FALSE) { # additional input: global obj_size, max_size
+   if (sample(3, 1) == 1) {
+     ret <- as.list(1:N)
+   } else if (sample(2, 1) == 1) {
+     ret <- as.pairlist(1:N)
+   } else {
+     ret <- as.pairlist(1:N)
+     setlev(ret, sample(2L^12L, 1L) - 1L)
+     setobj(ret, 1L)
+   }
+ 
+   for (i in 1:N) {
+     if (get_obj_size() > get("max_size", envir = globalenv())) break;
+     otype <- sample(12, size = 1)
+     z <- NULL
+     is_attribute <- ifelse(i == 1, F, sample(c(F, T), size = 1))
+     if (otype == 1) {z <- rnorm(1e4); set_obj_size(z);}
+     else if (otype == 2) { z <- sample(1e4) - 5e2; set_obj_size(z); }
+     else if (otype == 3) { z <- sample(c(T, F, NA), size = 1e4, replace = T); set_obj_size(z); }
+     else if (otype == 4) { z <- (sample(256, size = 1e4, replace = T) - 1) %>% as.raw; set_obj_size(z); }
+     else if (otype == 5) { z <- replicate(sample(1e4, size = 1), {rep(letters, length.out = sample(10, size = 1)) %>% paste(collapse = "")}); set_obj_size(z); }
+     else if (otype == 6) { z <- rep(letters, length.out = sample(1e4, size = 1)) %>% paste(collapse = ""); set_obj_size(z); }
+     else if (otype == 7) { z <- as.formula("y ~ a + b + c : d", env = globalenv()); attr(z, "blah") <- sample(1e4) - 5e2; set_obj_size(z); }
+     else if (with_envs && otype %in% c(8, 9)) { z <- function(x) {x + runif(1)} }
+     # else if(with_envs && otype %in% c(10,11)) { z <- new.env(); z$x <- random_object_generator(N, with_envs); makeActiveBinding("y", function() runif(1), z) }
+     else { z <- random_object_generator(N, with_envs) }
+     if (is_attribute) {
+       attr(ret[[i - 1]], runif(1) %>% as.character()) <- z
+     } else {
+       ret[[i]] <- z
+     }
+   }
+   return(ret)
+ }
> 
> rand_strings <- function(n) {
+   s <- sample(0:100, size = n, replace = T)
+   x <- lapply(unique(s), function(si) {
+     stringfish::random_strings(sum(s == si), si, vector_mode = "normal")
+   }) %>% unlist %>% sample
+   x[sample(n, size = n/10)] <- NA
+   return(x)
+ }
> 
> nested_tibble <- function() {
+   sub_tibble <- function(nr = 600, nc = 4) {
+     z <- lapply(1:nc, function(i) rand_strings(nr)) %>%
+       setNames(make.unique(paste0(sample(letters, nc), rand_strings(nc)))) %>%
+       bind_cols %>%
+       as_tibble
+   }
+   tibble(
+     col1 = rand_strings(100),
+     col2 = rand_strings(100),
+     col3 = lapply(1:100, function(i) sub_tibble(nr = 600, nc = 4)),
+     col4 = lapply(1:100, function(i) sub_tibble(nr = 600, nc = 4)),
+     col5 = lapply(1:100, function(i) sub_tibble(nr = 600, nc = 4))
+   ) %>% setNames(make.unique(paste0(sample(letters, 5), rand_strings(5))))
+ }
> 
> printCarriage <- function(x) {
+   cat(x, "\r")
+ }
> 
> serialize_identical <- function(x1, x2) {
+   identical(serialize(x1, NULL), serialize(x2, NULL))
+ }
> 
> ################################################################################################
> 
> qsave_rand <- function(x, file) {
+   alg <- sample(c("lz4", "zstd", "lz4hc", "zstd_stream", "uncompressed"), 1)
+   # alg <- "zstd_stream"
+   nt <- sample(5,1)
+   sc <- sample(0:15,1)
+   cl <- sample(10,1)
+   ch <- sample(c(T,F),1)
+   if (mode == "filestream") {
+     qsave(x, file = file, preset = "custom", algorithm = alg,
+         compress_level = cl, shuffle_control = sc, nthreads = nt, check_hash = ch)
+   } else if (mode == "fd") {
+     fd <- qs:::openFd(myfile, "w")
+     qsave_fd(x, fd, preset = "custom", algorithm = alg,
+           compress_level = cl, shuffle_control = sc, check_hash = ch)
+     qs:::closeFd(fd)
+   } else if (mode == "handle") {
+     h <- qs:::openHandle(myfile, "w")
+     qsave_handle(x, h, preset = "custom", algorithm = alg,
+              compress_level = cl, shuffle_control = sc, check_hash = ch)
+     qs:::closeHandle(h)
+   } else if (mode == "memory") {
+     .sobj <<- qserialize(x, preset = "custom", algorithm = alg,
+                          compress_level = cl, shuffle_control = sc, check_hash = ch)
+   } else {
+     stop(paste0("wrong write-mode selected: ", mode))
+   }
+ }
> 
> qread_rand <- function(file) {
+   ar <- sample(c(T,F),1)
+   nt <- sample(5,1)
+   if (mode == "filestream") {
+     x <- qread(file, use_alt_rep = ar, nthreads = nt, strict = T)
+   } else if (mode == "fd") {
+     if (sample(2,1) == 1) {
+       fd <- qs:::openFd(myfile, "r")
+       x <- qread_fd(fd, use_alt_rep = ar, strict = T)
+       qs:::closeFd(fd)
+     } else {
+       x <- qread(file, use_alt_rep = ar, nthreads = nt, strict = T)
+     }
+   } else if (mode == "handle") {
+     if (sample(2,1) == 1) {
+       h <- qs:::openHandle(myfile, "r")
+       x <- qread_handle(h, use_alt_rep = ar, strict = T)
+       qs:::closeHandle(h)
+     } else {
+       x <- qread(file, use_alt_rep = ar, nthreads = nt, strict = T)
+     }
+   } else if (mode == "memory") {
+     x <- qdeserialize(.sobj, use_alt_rep = ar, strict = T)
+   } else {
+     stop(paste0("wrong read-mode selected: ", mode))
+   }
+   return(x)
+ }
> 
> ################################################################################################
> 
> for (q in 1:reps) {
+   cat("Rep",  q, "of", reps, "\n")
+ 
+   # String correctness
+   time <- vector("numeric", length = 3)
+   for (tp in test_points) {
+     for (i in 1:3) {
+       x1 <- rep(letters, length.out = tp) %>% paste(collapse = "")
+       x1 <- c(NA, "", x1)
+       time[i] <- Sys.time()
+       qsave_rand(x1, file = myfile)
+       z <- qread_rand(file = myfile)
+       time[i] <- Sys.time() - time[i]
+       do_gc()
+       stopifnot(identical(z, x1))
+     }
+     printCarriage(sprintf("strings: %s, %s s",tp, signif(mean(time), 4)))
+   }
+   cat("\n")
+ 
+   # Character vectors
+   time <- vector("numeric", length = 3)
+   for (tp in test_points_slow) {
+     for (i in 1:3) {
+       # qs_use_alt_rep(F)
+       x1 <- rep(as.raw(sample(255)), length.out = tp*10) %>% rawToChar
+       cuts <- sample(tp*10, tp + 1) %>% sort %>% as.numeric
+       x1 <- splitstr(x1, cuts)
+       x1 <- c(NA, "", x1)
+       qsave_rand(x1, file = myfile)
+       time[i] <- Sys.time()
+       z <- qread_rand(file = myfile)
+       time[i] <- Sys.time() - time[i]
+       do_gc()
+       stopifnot(identical(z, x1))
+     }
+     printCarriage(sprintf("Character Vectors: %s, %s s",tp, signif(mean(time), 4)))
+   }
+   cat("\n")
+ 
+   # stringfish character vectors -- require R > 3.5.0
+   if (utils::compareVersion(as.character(getRversion()), "3.5.0") != -1) {
+     time <- vector("numeric", length = 3)
+     for (tp in test_points_slow) {
+       for (i in 1:3) {
+         x1 <- rep(as.raw(sample(255)), length.out = tp*10) %>% rawToChar
+         cuts <- sample(tp*10, tp + 1) %>% sort %>% as.numeric
+         x1 <- splitstr(x1, cuts)
+         x1 <- c(NA, "", x1)
+         x1 <- stringfish::convert_to_sf(x1)
+         qsave_rand(x1, file = myfile)
+         time[i] <- Sys.time()
+         z <- qread_rand(file = myfile)
+         time[i] <- Sys.time() - time[i]
+         do_gc()
+         stopifnot(identical(z, x1))
+       }
+       printCarriage(sprintf("Stringfish: %s, %s s",tp, signif(mean(time), 4)))
+     }
+     cat("\n")
+   }
+ 
+   # Integers
+   time <- vector("numeric", length = 3)
+   for (tp in test_points) {
+     for (i in 1:3) {
+       x1 <- sample(1:tp, replace = T)
+       x1 <- c(NA, x1)
+       time[i] <- Sys.time()
+       qsave_rand(x1, file = myfile)
+       z <- qread_rand(file = myfile)
+       time[i] <- Sys.time() - time[i]
+       do_gc()
+       stopifnot(identical(z, x1))
+     }
+     printCarriage(sprintf("Integers: %s, %s s",tp, signif(mean(time), 4)))
+   }
+   cat("\n")
+ 
+   # Doubles
+   time <- vector("numeric", length = 3)
+   for (tp in test_points) {
+     for (i in 1:3) {
+       x1 <- rnorm(tp)
+       x1 <- c(NA, x1)
+       time[i] <- Sys.time()
+       qsave_rand(x1, file = myfile)
+       z <- qread_rand(file = myfile)
+       time[i] <- Sys.time() - time[i]
+       do_gc()
+       stopifnot(identical(z, x1))
+     }
+     printCarriage(sprintf("Numeric: %s, %s s",tp, signif(mean(time), 4)))
+   }
+   cat("\n")
+ 
+   # Logical
+   time <- vector("numeric", length = 3)
+   for (tp in test_points) {
+     for (i in 1:3) {
+ 
+       x1 <- sample(c(T, F, NA), replace = T, size = tp)
+       time[i] <- Sys.time()
+       qsave_rand(x1, file = myfile)
+       z <- qread_rand(file = myfile)
+       time[i] <- Sys.time() - time[i]
+       do_gc()
+       stopifnot(identical(z, x1))
+     }
+     printCarriage(sprintf("Logical: %s, %s s",tp, signif(mean(time),4)))
+   }
+   cat("\n")
+ 
+   # List
+   time <- vector("numeric", length = 3)
+   for (tp in test_points_slow) {
+     for (i in 1:3) {
+       x1 <- generateList(sample(1:4, replace = T, size = tp))
+       time[i] <- Sys.time()
+       qsave_rand(x1, file = myfile)
+       z <- qread_rand(file = myfile)
+       time[i] <- Sys.time() - time[i]
+       do_gc()
+       stopifnot(identical(z, x1))
+     }
+     printCarriage(sprintf("List: %s, %s s",tp, signif(mean(time),4)))
+   }
+   cat("\n")
+ 
+   for (i in 1:3) {
+     x1 <- rep( replicate(1000, { rep(letters, length.out = 2^7 + sample(10, size = 1)) %>% paste(collapse = "") }), length.out = 1e6 )
+     x1 <- data.frame(str = x1,num = runif(1:1000), stringsAsFactors = F)
+     qsave_rand(x1, file = myfile)
+     z <- qread_rand(file = myfile)
+     do_gc()
+     stopifnot(identical(z, x1))
+   }
+   cat("Data.frame test")
+   cat("\n")
+ 
+   for (i in 1:3) {
+     x1 <- rep( replicate(1000, { rep(letters, length.out = 2^7 + sample(10, size = 1)) %>% paste(collapse = "") }), length.out = 1e6 )
+     x1 <- data.table(str = x1,num = runif(1:1e6))
+     qsave_rand(x1, file = myfile)
+     z <- qread_rand(file = myfile)
+     do_gc()
+     stopifnot(all(z == x1))
+   }
+   cat("Data.table test")
+   cat("\n")
+ 
+   for (i in 1:3) {
+ 
+     x1 <- rep( replicate(1000, { rep(letters, length.out = 2^7 + sample(10, size = 1)) %>% paste(collapse = "") }), length.out = 1e6 )
+     x1 <- tibble(str = x1,num = runif(1:1e6))
+     qsave_rand(x1, file = myfile)
+     z <- qread_rand(file = myfile)
+     do_gc()
+     stopifnot(identical(z, x1))
+   }
+   cat("Tibble test")
+   cat("\n")
+ 
+   # Encoding test
+   if (Sys.info()[['sysname']] != "Windows") {
+     for (i in 1:3) {
+ 
+       x1 <- "己所不欲,勿施于人" # utf 8
+       x2 <- x1
+       Encoding(x2) <- "latin1"
+       x3 <- x1
+       Encoding(x3) <- "bytes"
+       x4 <- rep(x1, x2, length.out = 1e4) %>% paste(collapse = ";")
+       x1 <- c(x1, x2, x3, x4)
+       qsave_rand(x1, file = myfile)
+       z <- qread_rand(file = myfile)
+       do_gc()
+       stopifnot(identical(z, x1))
+     }
+     printCarriage("Encoding test")
+   } else {
+     printCarriage("(Encoding test not run on windows)")
+   }
+   cat("\n")
+ 
+   # complex vectors
+   time <- vector("numeric", length = 3)
+   for (tp in test_points) {
+     for (i in 1:3) {
+ 
+       re <- rnorm(tp)
+       im <- runif(tp)
+       x1 <- complex(real = re, imaginary = im)
+       x1 <- c(NA_complex_, x1)
+       time[i] <- Sys.time()
+       qsave_rand(x1, file = myfile)
+       z <- qread_rand(file = myfile)
+       time[i] <- Sys.time() - time[i]
+       do_gc()
+       stopifnot(identical(z, x1))
+     }
+     printCarriage(sprintf("Complex: %s, %s s",tp, signif(mean(time), 4)))
+   }
+   cat("\n")
+ 
+   # factors
+   for (tp in test_points) {
+     time <- vector("numeric", length = 3)
+     for (i in 1:3) {
+       x1 <- factor(rep(letters, length.out = tp), levels = sample(letters), ordered = TRUE)
+       time[i] <- Sys.time()
+       qsave_rand(x1, file = myfile)
+       z <- qread_rand(file = myfile)
+       time[i] <- Sys.time() - time[i]
+       do_gc()
+       stopifnot(identical(z, x1))
+     }
+     printCarriage(sprintf("Factors: %s, %s s",tp, signif(mean(time), 4)))
+   }
+   cat("\n")
+ 
+   # Random objects
+   time <- vector("numeric", length = 8)
+   for (i in 1:8) {
+     # qs_use_alt_rep(sample(c(T, F), size = 1))
+     obj_size <- 0
+     x1 <- random_object_generator(12)
+     printCarriage(sprintf("Random objects: %s bytes", object.size(x1) %>% as.numeric))
+     time[i] <- Sys.time()
+     qsave_rand(x1, file = myfile)
+     z <- qread_rand(file = myfile)
+     time[i] <- Sys.time() - time[i]
+     do_gc()
+     stopifnot(identical(z, x1))
+   }
+   printCarriage(sprintf("Random objects: %s s", signif(mean(time), 4)))
+   cat("\n")
+ 
+   # nested attributes
+   time <- vector("numeric", length = 3)
+   for (i in 1:3) {
+     x1 <- as.list(1:26)
+     attr(x1[[26]], letters[26]) <- rnorm(100)
+     for (i in 25:1) {
+       attr(x1[[i]], letters[i]) <- x1[[i + 1]]
+     }
+     time[i] <- Sys.time()
+     qsave_rand(x1, file = myfile)
+     z <- qread_rand(file = myfile)
+     time[i] <- Sys.time() - time[i]
+     do_gc()
+     stopifnot(identical(z, x1))
+   }
+   printCarriage(sprintf("Nested attributes: %s s", signif(mean(time), 4)))
+   cat("\n")
+ 
+   # alt-rep -- should serialize the unpacked object
+   time <- vector("numeric", length = 3)
+   for (i in 1:3) {
+     x1 <- 1:max_size
+     time[i] <- Sys.time()
+     qsave_rand(x1, file = myfile)
+     z <- qread_rand(file = myfile)
+     time[i] <- Sys.time() - time[i]
+     do_gc()
+     stopifnot(identical(z, x1))
+   }
+   printCarriage(sprintf("Alt rep integer: %s s", signif(mean(time), 4)))
+   cat("\n")
+ 
+ 
+   # Environment test
+   time <- vector("numeric", length = 3)
+   for (i in 1:3) {
+     x1 <- new.env()
+     x1[["a"]] <- 1:max_size
+     x1[["b"]] <- runif(max_size)
+     x1[["c"]] <- stringfish::random_strings(1e4, vector_mode = "normal")
+     time[i] <- Sys.time()
+     qsave_rand(x1, file = myfile)
+     z <- qread_rand(file = myfile)
+     stopifnot(identical(z[["a"]], x1[["a"]]))
+     stopifnot(identical(z[["b"]], x1[["b"]]))
+     stopifnot(identical(z[["c"]], x1[["c"]]))
+     time[i] <- Sys.time() - time[i]
+     do_gc()
+   }
+   printCarriage(sprintf("Environment test: %s s", signif(mean(time), 4)))
+   cat("\n")
+ 
+   time <- vector("numeric", length = 3)
+   for (i in 1:3) {
+     x1 <- nested_tibble()
+     time[i] <- Sys.time()
+     qsave_rand(x1, file = myfile)
+     z <- qread_rand(file = myfile)
+     stopifnot(identical(z, x1))
+     time[i] <- Sys.time() - time[i]
+     do_gc()
+   }
+   printCarriage(sprintf("nested tibble test: %s s", signif(mean(time), 4)))
+   cat("\n")
+ }
Rep 1 of 2 
strings: 0, 0.05226 s 
strings: 1, 0.00779 s 
strings: 2, 0.005751 s 
strings: 4, 0.007838 s 
strings: 8, 0.002318 s 
strings: 31, 0.007268 s 
strings: 33, 0.003209 s 
strings: 32, 0.004083 s 
strings: 255, 0.004804 s 
strings: 257, 0.00559 s 
strings: 256, 0.005134 s 
strings: 65535, 0.006118 s 
strings: 65537, 0.006492 s 
strings: 65536, 0.00536 s 
strings: 1e+06, 0.02081 s 

Character Vectors: 0, 0.001144 s 
Character Vectors: 1, 0.00108 s 
Character Vectors: 2, 0.00198 s 
Character Vectors: 4, 0.002484 s 
Character Vectors: 8, 0.0008907 s 
Character Vectors: 31, 0.002315 s 
Character Vectors: 33, 0.001102 s 
Character Vectors: 32, 0.002196 s 
Character Vectors: 255, 0.001686 s 
Character Vectors: 257, 0.001455 s 
Character Vectors: 256, 0.001321 s 
Character Vectors: 65535, 0.04348 s 
Character Vectors: 65537, 0.03008 s 
Character Vectors: 65536, 0.03206 s 

Stringfish: 0, 0.002423 s 
Stringfish: 1, 0.002013 s 
Stringfish: 2, 0.001189 s 
Stringfish: 4, 0.002247 s 
Stringfish: 8, 0.0009296 s 
Stringfish: 31, 0.002941 s 
Stringfish: 33, 0.000605 s 
Stringfish: 32, 0.001906 s 
Stringfish: 255, 0.000702 s 
Stringfish: 257, 0.0006309 s 
Stringfish: 256, 0.001172 s 
Stringfish: 65535, 0.03668 s 
Stringfish: 65537, 0.04875 s 
Stringfish: 65536, 0.02923 s 

Integers: 0, 0.005715 s 
Integers: 1, 0.004507 s 
Integers: 2, 0.002275 s 
Integers: 4, 0.003704 s 
Integers: 8, 0.003843 s 
Integers: 31, 0.00252 s 
Integers: 33, 0.004282 s 
Integers: 32, 0.004069 s 
Integers: 255, 0.002895 s 
Integers: 257, 0.002519 s 
Integers: 256, 0.004082 s 
Integers: 65535, 0.01399 s 
Integers: 65537, 0.01692 s 
Integers: 65536, 0.01371 s 
Integers: 1e+06, 0.03322 s 

Numeric: 0, 0.005462 s 
Numeric: 1, 0.003754 s 
Numeric: 2, 0.004165 s 
Numeric: 4, 0.005307 s 
Numeric: 8, 0.006371 s 
Numeric: 31, 0.005763 s 
Numeric: 33, 0.006194 s 
Numeric: 32, 0.006085 s 
Numeric: 255, 0.003792 s 
Numeric: 257, 0.004067 s 
Numeric: 256, 0.003875 s 
Numeric: 65535, 0.00904 s 
Numeric: 65537, 0.007324 s 
Numeric: 65536, 0.04577 s 
Numeric: 1e+06, 0.3297 s 

Logical: 0, 0.003996 s 
Logical: 1, 0.002107 s 
Logical: 2, 0.003113 s 
Logical: 4, 0.002708 s 
Logical: 8, 0.004281 s 
Logical: 31, 0.008489 s 
Logical: 33, 0.002771 s 
Logical: 32, 0.00435 s 
Logical: 255, 0.003911 s 
Logical: 257, 0.004796 s 
Logical: 256, 0.004092 s 
Logical: 65535, 0.02198 s 
Logical: 65537, 0.03205 s 
Logical: 65536, 0.01034 s 
Logical: 1e+06, 0.1358 s 

List: 0, 0.005204 s 
List: 1, 0.004667 s 
List: 2, 0.001877 s 
List: 4, 0.003154 s 
List: 8, 0.004232 s 
List: 31, 0.004918 s 
List: 33, 0.004681 s 
List: 32, 0.005295 s 
List: 255, 0.004656 s 
List: 257, 0.004457 s 
List: 256, 0.002855 s 
List: 65535, 0.1114 s 
List: 65537, 0.1105 s 
List: 65536, 0.09607 s 

Data.frame test
Data.table test
Tibble test
Encoding test 

Complex: 0, 0.003167 s 
Complex: 1, 0.001717 s 
Complex: 2, 0.003108 s 
Complex: 4, 0.004002 s 
Complex: 8, 0.007101 s 
Complex: 31, 0.003483 s 
Complex: 33, 0.003996 s 
Complex: 32, 0.002762 s 
Complex: 255, 0.006385 s 
Complex: 257, 0.001923 s 
Complex: 256, 0.004081 s 
Complex: 65535, 0.02265 s 
Complex: 65537, 0.04487 s 
Complex: 65536, 0.03112 s 
Complex: 1e+06, 1.103 s 

Factors: 0, 0.004305 s 
Factors: 1, 0.003119 s 
Factors: 2, 0.005659 s 
Factors: 4, 0.01875 s 
Factors: 8, 0.003422 s 
Factors: 31, 0.004929 s 
Factors: 33, 0.006879 s 
Factors: 32, 0.008085 s 
Factors: 255, 0.002431 s 
Factors: 257, 0.00355 s 
Factors: 256, 0.002194 s 
Factors: 65535, 0.008465 s 
Factors: 65537, 0.007339 s 
Factors: 65536, 0.007729 s 
Factors: 1e+06, 0.03529 s 

Random objects: 1090484 bytes 
Random objects: 1018404 bytes 
Random objects: 1052516 bytes 
Random objects: 1034444 bytes 
Random objects: 1024532 bytes 
Random objects: 1038040 bytes 
Random objects: 1055520 bytes 
Random objects: 1095560 bytes 
Random objects: 0.06615 s 

Nested attributes: 0.002346 s 

Alt rep integer: 0.1815 s 

Environment test: 1.11 s 

nested tibble test: 3.819 s 

Rep 2 of 2 
strings: 0, 0.006519 s 
strings: 1, 0.005103 s 
strings: 2, 0.003132 s 
strings: 4, 0.002589 s 
strings: 8, 0.001922 s 
strings: 31, 0.005121 s 
strings: 33, 0.00762 s 
strings: 32, 0.002416 s 
strings: 255, 0.001947 s 
strings: 257, 0.002393 s 
strings: 256, 0.005304 s 
strings: 65535, 0.005741 s 
strings: 65537, 0.003835 s 
strings: 65536, 0.003255 s 
strings: 1e+06, 0.007774 s 

Character Vectors: 0, 0.001629 s 
Character Vectors: 1, 0.002239 s 
Character Vectors: 2, 0.0008825 s 
Character Vectors: 4, 0.002845 s 
Character Vectors: 8, 0.001425 s 
Character Vectors: 31, 0.0007193 s 
Character Vectors: 33, 0.001587 s 
Character Vectors: 32, 0.001855 s 
Character Vectors: 255, 0.002968 s 
Character Vectors: 257, 0.00235 s 
Character Vectors: 256, 0.001539 s 
Character Vectors: 65535, 0.03014 s 
Character Vectors: 65537, 0.03355 s 
Character Vectors: 65536, 0.0286 s 

Stringfish: 0, 0.002779 s 
Stringfish: 1, 0.000792 s 
Stringfish: 2, 0.000999 s 
Stringfish: 4, 0.0007889 s 
Stringfish: 8, 0.001807 s 
Stringfish: 31, 0.0009585 s 
Stringfish: 33, 0.002786 s 
Stringfish: 32, 0.001583 s 
Stringfish: 255, 0.001337 s 
Stringfish: 257, 0.001438 s 
Stringfish: 256, 0.0009659 s 
Stringfish: 65535, 0.03238 s 
Stringfish: 65537, 0.0341 s 
Stringfish: 65536, 0.03513 s 

Integers: 0, 0.005136 s 
Integers: 1, 0.002703 s 
Integers: 2, 0.006546 s 
Integers: 4, 0.004409 s 
Integers: 8, 0.003787 s 
Integers: 31, 0.002896 s 
Integers: 33, 0.001745 s 
Integers: 32, 0.006448 s 
Integers: 255, 0.002846 s 
Integers: 257, 0.003003 s 
Integers: 256, 0.002579 s 
Integers: 65535, 0.01993 s 
Integers: 65537, 0.008373 s 
Integers: 65536, 0.01883 s 
Integers: 1e+06, 0.05779 s 

Numeric: 0, 0.002293 s 
Numeric: 1, 0.004397 s 
Numeric: 2, 0.006081 s 
Numeric: 4, 0.0027 s 
Numeric: 8, 0.002759 s 
Numeric: 31, 0.007719 s 
Numeric: 33, 0.002966 s 
Numeric: 32, 0.004715 s 
Numeric: 255, 0.01042 s 
Numeric: 257, 0.004352 s 
Numeric: 256, 0.005002 s 
Numeric: 65535, 0.03582 s 
Numeric: 65537, 0.02573 s 
Numeric: 65536, 0.008604 s 
Numeric: 1e+06, 0.4065 s 

Logical: 0, 0.005245 s 
Logical: 1, 0.0033 s 
Logical: 2, 0.003607 s 
Logical: 4, 0.004457 s 
Logical: 8, 0.005503 s 
Logical: 31, 0.002971 s 
Logical: 33, 0.00353 s 
Logical: 32, 0.009703 s 
Logical: 255, 0.1582 s 
Logical: 257, 0.003245 s 
Logical: 256, 0.004522 s 
Logical: 65535, 0.003728 s 
Logical: 65537, 0.007011 s 
Logical: 65536, 0.05354 s 
Logical: 1e+06, 0.05509 s 

List: 0, 0.002102 s 
List: 1, 0.009788 s 
List: 2, 0.004364 s 
List: 4, 0.003085 s 
List: 8, 0.002884 s 
List: 31, 0.004585 s 
List: 33, 0.002863 s 
List: 32, 0.004547 s 
List: 255, 0.00452 s 
List: 257, 0.004032 s 
List: 256, 0.004806 s 
List: 65535, 0.1015 s 
List: 65537, 0.07283 s 
List: 65536, 0.08941 s 

Data.frame test
Data.table test
Tibble test
Encoding test 

Complex: 0, 0.003657 s 
Complex: 1, 0.001906 s 
Complex: 2, 0.004832 s 
Complex: 4, 0.006138 s 
Complex: 8, 0.005333 s 
Complex: 31, 0.004794 s 
Complex: 33, 0.004395 s 
Complex: 32, 0.001775 s 
Complex: 255, 0.002469 s 
Complex: 257, 0.004995 s 
Complex: 256, 0.00609 s 
Complex: 65535, 0.09667 s 
Complex: 65537, 0.03396 s 
Complex: 65536, 0.05017 s 
Complex: 1e+06, 0.1479 s 

Factors: 0, 0.004229 s 
Factors: 1, 0.008636 s 
Factors: 2, 0.003097 s 
Factors: 4, 0.005384 s 
Factors: 8, 0.004585 s 
Factors: 31, 0.003429 s 
Factors: 33, 0.003271 s 
Factors: 32, 0.00457 s 
Factors: 255, 0.004796 s 
Factors: 257, 0.006625 s 
Factors: 256, 0.003838 s 
Factors: 65535, 0.005 s 
Factors: 65537, 0.006286 s 
Factors: 65536, 0.01056 s 
Factors: 1e+06, 0.1789 s 

Random objects: 1065172 bytes 
Random objects: 1087740 bytes 
Random objects: 1021948 bytes 
Random objects: 1025244 bytes 
Random objects: 1019596 bytes 
Random objects: 1050236 bytes 
Random objects: 1070544 bytes 
Random objects: 1021228 bytes 
Random objects: 0.1365 s 

Nested attributes: 0.001955 s 

Alt rep integer: 0.1076 s 

Environment test: 0.4433 s 

nested tibble test: 5.215 s 

> 
> ################################################################################################
> # some one off tests
> 
> # test 1: alt rep implementation
> # https://github.com/traversc/qs/issues/9
> 
> # stringfish character vectors -- require R > 3.5.0
> if (utils::compareVersion(as.character(getRversion()), "3.5.0") != -1) {
+   x <- data.table(x = 1:26, y = letters)
+   qsave(x, file = myfile)
+   xu <- qread(myfile, use_alt_rep = T)
+   data.table::setnames(xu, 1, "a")
+   stopifnot(identical(c("a", "y"), colnames(xu)))
+   data.table::setnames(xu, 2, "b")
+   stopifnot(identical(c("a", "b"), colnames(xu)))
+ }
> 
> cat("tests done\n")
tests done
> rm(list = setdiff(ls(), c("total_time", "do_gc")))
> do_gc()
          used (Mb) gc trigger  (Mb) max used  (Mb)
Ncells  644901 19.7    1946142  59.4  2432677  74.3
Vcells 1219409  9.4   18588537 141.9 23235667 177.3
> total_time <- Sys.time() - total_time
> print(total_time)
Time difference of 5.972774 mins
> 
> proc.time()
   user  system elapsed 
436.848  40.159 359.389

Next, the second test fails as before (this one was not modified):

R version 4.2.3 (2023-03-15) -- "Shortstop Beagle"
Copyright (C) 2023 The R Foundation for Statistical Computing
Platform: powerpc-apple-darwin10.8.0 (32-bit)

> total_time <- Sys.time()
> 
> suppressMessages(library(Rcpp))
> suppressMessages(library(dplyr))
> suppressMessages(library(data.table))
> suppressMessages(library(qs))
> suppressMessages(library(stringfish))
> options(warn = 1)
> 
> do_gc <- function() {
+   if (utils::compareVersion(as.character(getRversion()), "3.5.0") != -1) {
+     gc(full = TRUE)
+   } else {
+     gc()
+   }
+ }
> 
> # because sourceCpp uses setwd, we need absolute path to R_TESTS when run within R CMD check
> R_TESTS <- Sys.getenv("R_TESTS") # startup.Rs
> if (nzchar(R_TESTS)) {
+   R_TESTS_absolute <- normalizePath(R_TESTS)
+   Sys.setenv(R_TESTS = R_TESTS_absolute)
+ }
> sourceCpp(code = decode_source(
+ c("un]'BAAA@QRtHACAAAAAAA+>nAAAv7#aT)JXC:JAR%*QaAh72AB'B'vw5pac6M<xR5V+cWn+KxIBy6|r,OVt?2~X%:xAw/,f}d^_#|XKWFvW%N#TD'H'$}!eH:<{E(H&Yk90NjkdSLMP5[S$2_W,xfO(ao}fQ+jw",
+   "Q{>6_%ygB8MFP)gz)^m++prny$p$2zd4,TjRyD]#^IDs$AEA.Iln5o|!b6Rg,?H[7:4>fVhjk;Elgs[t~/2QV.smWKr)qciq:,gJ.WM#<7X[GTC*H}p8LL/GQv]6d>R=O>iPUN11/~8!@P^g#xecEHjR>JF<,zuB",
+   "8d@Aq1w1Wu;h`BaHYM2BlL6'_X((9Fn4,ns<9^5xcw[_.)4nTTMPw~^2pcKT)+g&])=3]x2;(q7gVbF5qI7RS.hY;}@^Pu~Qxr5/V!#B6}G{Csfkb&I^Xe;hLkO}dX;5`'Wd8?BvZ*@laa2qbX<XE_{|7H*;869]",
+   "zXa+QU~nU3~Xan{Pt5:LtE;TJ=^8_jDXcl#X:u)M`h&a&t&':CQ0!0atQoDNsGfRotbL2BvG&7;TM<uKn>{L%h{E2WwF+}2aDp01lLf&+8HLAbetZ_hlWHeGgi|Xl.U@;O~RhGYsXC1e}#R]e=ky)D<SpP+)~|XO",
+   "TYww=2?PA~!09BKVaX]Kr1Xt[O&{gzkTc9KbV=<uAA+ivS![q)L4F#n5'*XTy2YPl?+(1Szz:4klMBs?9Bk9!wKDZV'mx*Qb#CLRs6Sd1[5HYHk;:H2d{CZt|=iTU2EwD&=pD(:wGGm_$H$WNFG'g9aOTl4q^IQd",
+   "KCA4q>Z>Lku@C8Iy")))
Error in qdeserialize(x) : Endian of system doesn't match file endian
Calls: sourceCpp -> writeLines -> decode_source -> qdeserialize
Execution halted
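
For context on the error above: PowerPC is big-endian, and the payload embedded in the test was presumably serialized on a little-endian machine. How qs actually checks this is not shown in this thread. The following is only a minimal sketch of the general technique (detect the host byte order, then swap multi-byte fields when the file order differs); all function names are hypothetical and are not part of the qs API.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: true when the host stores the least significant byte first.
inline bool host_is_little_endian() {
    const std::uint32_t probe = 1;
    unsigned char first_byte;
    std::memcpy(&first_byte, &probe, 1);  // inspect the lowest-addressed byte
    return first_byte == 1;
}

// Hypothetical helper: reverse the byte order of a 32-bit value.
inline std::uint32_t byteswap32(std::uint32_t v) {
    return ((v & 0x000000FFu) << 24) | ((v & 0x0000FF00u) << 8) |
           ((v & 0x00FF0000u) >> 8)  | ((v & 0xFF000000u) >> 24);
}

// Hypothetical helper: convert a raw 32-bit field read from a file into host
// order, given the byte order recorded for the file (an assumed header flag).
inline std::uint32_t to_host_order(std::uint32_t raw, bool file_is_little_endian) {
    return file_is_little_endian == host_is_little_endian() ? raw : byteswap32(raw);
}
```

A reader built this way can accept files from either kind of machine instead of stopping with an error when the two byte orders differ.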

barracuda156 avatar Mar 26 '23 16:03 barracuda156

The issue was the same for the 1st and 2nd tests, so this is a good sign. Try out the latest GitHub version. I also added a check for GCC that adds the -latomic flag.

traversc avatar Mar 27 '23 07:03 traversc

@traversc Thank you very much! Will test later tonight and update you.

P.S. Regarding https://github.com/traversc/qs/commit/f9a244b459289b7416d0e4c3c4ed05225811a229: in fact, whether -latomic is needed does not depend on the GCC version. It is never passed by default, and that is intended behavior: the GCC developers decided that it is preferable to leave this to configure rather than link to libatomic unconditionally. I use GCC 12 myself, and with it -latomic has to be passed explicitly on 32-bit systems.

barracuda156 avatar Mar 27 '23 07:03 barracuda156

I'll take out that comment on the GCC version. I'm not sure I understand the conditions under which -latomic needs to be passed (on 32-bit Windows it's not required). Could you point me to some documentation?

traversc avatar Mar 27 '23 17:03 traversc

@traversc I tried to build from master. Unfortunately, the atomics test does not work; -latomic is not passed to the linker:

* installing *source* package ‘qs’ ...
** using staged installation
checking for pkg-config... /opt/local/bin/pkg-config
R CXX compiler: /opt/local/bin/g++-mp-12
zstd 1.5.4 library detected -- skipping zstd compilation
lz4 1.9.4 library detected -- skipping lz4 compilation
configure: creating ./config.status
config.status: creating src/Makevars
** libs
/opt/local/bin/g++-mp-12 -std=gnu++14 -I"/opt/local/Library/Frameworks/R.framework/Resources/include" -DNDEBUG -DRCPP_USE_UNWIND_PROTECT -DRCPP_NO_RTTI -DRCPP_NO_SUGAR -I. -I/opt/local/include -I/opt/local/include  -I'/opt/local/Library/Frameworks/R.framework/Versions/4.2/Resources/library/Rcpp/include' -I'/opt/local/Library/Frameworks/R.framework/Versions/4.2/Resources/library/RApiSerialize/include' -I'/opt/local/Library/Frameworks/R.framework/Versions/4.2/Resources/library/stringfish/include' -isystem/opt/local/include/LegacySupport -I/opt/local/include   -fPIC  -pipe -Os -arch ppc  -c RcppExports.cpp -o RcppExports.o
/opt/local/bin/g++-mp-12 -std=gnu++14 -I"/opt/local/Library/Frameworks/R.framework/Resources/include" -DNDEBUG -DRCPP_USE_UNWIND_PROTECT -DRCPP_NO_RTTI -DRCPP_NO_SUGAR -I. -I/opt/local/include -I/opt/local/include  -I'/opt/local/Library/Frameworks/R.framework/Versions/4.2/Resources/library/Rcpp/include' -I'/opt/local/Library/Frameworks/R.framework/Versions/4.2/Resources/library/RApiSerialize/include' -I'/opt/local/Library/Frameworks/R.framework/Versions/4.2/Resources/library/stringfish/include' -isystem/opt/local/include/LegacySupport -I/opt/local/include   -fPIC  -pipe -Os -arch ppc  -c qs_functions.cpp -o qs_functions.o
/opt/local/bin/g++-mp-12 -std=gnu++14 -dynamiclib -Wl,-headerpad_max_install_names -undefined dynamic_lookup -single_module -multiply_defined suppress -L/opt/local/Library/Frameworks/R.framework/Resources/lib -Wl,-headerpad_max_install_names -Wl,-rpath,/opt/local/lib/libgcc -L/opt/local/lib -lMacportsLegacySupport -arch ppc -o qs.so RcppExports.o qs_functions.o -L. -lpthread -L/opt/local/lib -lzstd -L/opt/local/lib -llz4 -F/opt/local/Library/Frameworks/R.framework/.. -framework R -Wl,-framework -Wl,CoreFoundation
installing to /opt/local/var/macports/build/_opt_PPCRosettaPorts_R_R-qs/R-qs/work/qs-f9a244b459289b7416d0e4c3c4ed05225811a229/qs.Rcheck/00LOCK-qs/00new/qs/libs
** R
** data
*** moving datasets to lazyload DB
** inst
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
** building package indices
** installing vignettes
** testing if installed package can be loaded from temporary location
Error: package or namespace load failed for ‘qs’ in dyn.load(file, DLLpath = DLLpath, ...):
 unable to load shared object '/opt/local/var/macports/build/_opt_PPCRosettaPorts_R_R-qs/R-qs/work/qs-f9a244b459289b7416d0e4c3c4ed05225811a229/qs.Rcheck/00LOCK-qs/00new/qs/libs/qs.so':
  dlopen(/opt/local/var/macports/build/_opt_PPCRosettaPorts_R_R-qs/R-qs/work/qs-f9a244b459289b7416d0e4c3c4ed05225811a229/qs.Rcheck/00LOCK-qs/00new/qs/libs/qs.so, 6): Symbol not found: ___atomic_store_8
  Referenced from: /opt/local/var/macports/build/_opt_PPCRosettaPorts_R_R-qs/R-qs/work/qs-f9a244b459289b7416d0e4c3c4ed05225811a229/qs.Rcheck/00LOCK-qs/00new/qs/libs/qs.so
  Expected in: dynamic lookup

Error: loading failed
Execution halted
ERROR: loading failed
* removing ‘/opt/local/var/macports/build/_opt_PPCRosettaPorts_R_R-qs/R-qs/work/qs-f9a244b459289b7416d0e4c3c4ed05225811a229/qs.Rcheck/qs’

As for documentation: linking to libatomic is needed when the hardware does not support the required atomic operations natively:

> Current releases use the newer __atomic intrinsics, which are implemented by library calls if the hardware doesn't support them. Undefined references to functions like __atomic_is_lock_free should be resolved by linking to libatomic.

https://gcc.gnu.org/onlinedocs/libstdc++/manual/ext_concurrency_impl.html

ppc32 needs that: https://github.com/iains/darwin-toolchains-start-here/discussions/30#discussioncomment-4242172

Apparently arm32 needs it as well: https://github.com/abseil/abseil-cpp/issues/836

IMO this is the general case for 32-bit platforms. There might be some cases where specific 64-bit configurations require libatomic too: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81358
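
To illustrate where a symbol like `___atomic_store_8` can come from, here is a minimal sketch, not code from qs: a 64-bit `std::atomic` used on a target without native 8-byte atomics. The names `g_counter`, `set_counter`, `get_counter` and `llong_always_lock_free` are hypothetical. A configure script could try to link a small program like this, first without and then with -latomic, to decide whether the flag is needed; that is an assumption about one possible approach, not a description of what the qs configure script does.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical shared counter, 64 bits wide.
static std::atomic<std::uint64_t> g_counter{0};

// On targets without native 8-byte atomics (e.g. 32-bit PowerPC), GCC may
// compile this store into a call to __atomic_store_8, which lives in libatomic.
void set_counter(std::uint64_t v) { g_counter.store(v); }

// Likewise, this load may become a call to __atomic_load_8.
std::uint64_t get_counter() { return g_counter.load(); }

// ATOMIC_LLONG_LOCK_FREE is 2 when long long atomics are always lock-free,
// 1 when they sometimes are, and 0 when they never are.
constexpr bool llong_always_lock_free() { return ATOMIC_LLONG_LOCK_FREE == 2; }
```

Where `llong_always_lock_free()` is false, the compiler cannot always use native instructions for these operations, so library calls and an explicit -latomic are likely to be required.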

barracuda156 avatar Mar 27 '23 20:03 barracuda156

@traversc Aside from the atomics test, everything else works now and the tests pass:

* using log directory ‘/opt/local/var/macports/build/_opt_PPCRosettaPorts_R_R-qs/R-qs/work/qs-f9a244b459289b7416d0e4c3c4ed05225811a229/qs.Rcheck’
* using R version 4.2.3 (2023-03-15)
* using platform: powerpc-apple-darwin10.8.0 (32-bit)
* using session charset: UTF-8
* checking for file ‘qs/DESCRIPTION’ ... OK
* checking extension type ... Package
* this is package ‘qs’ version ‘0.25.6’
* package encoding: UTF-8
* checking package namespace information ... OK
* checking package dependencies ... OK
* checking if this is a source package ... OK
* checking if there is a namespace ... OK
* checking for executable files ... WARNING
Found the following executable file:
  src/LICENSES/LZ4_LICENSE.txt
Source packages should not contain undeclared executable files.
See section ‘Package structure’ in the ‘Writing R Extensions’ manual.
* checking for hidden files and directories ... OK
* checking for portable file names ... OK
* checking for sufficient/correct file permissions ... OK
* checking whether package ‘qs’ can be installed ... OK
* checking installed package size ... OK
* checking package directory ... OK
* checking DESCRIPTION meta-information ... OK
* checking top-level files ... OK
* checking for left-over files ... OK
* checking index information ... OK
* checking package subdirectories ... OK
* checking R files for non-ASCII characters ... OK
* checking R files for syntax errors ... OK
* checking whether the package can be loaded ... OK
* checking whether the package can be loaded with stated dependencies ... OK
* checking whether the package can be unloaded cleanly ... OK
* checking whether the namespace can be loaded with stated dependencies ... OK
* checking whether the namespace can be unloaded cleanly ... OK
* checking dependencies in R code ... OK
* checking S3 generic/method consistency ... OK
* checking replacement functions ... OK
* checking foreign function calls ... OK
* checking R code for possible problems ... OK
* checking Rd files ... OK
* checking Rd metadata ... OK
* checking Rd cross-references ... OK
* checking for missing documentation entries ... OK
* checking for code/documentation mismatches ... OK
* checking Rd \usage sections ... OK
* checking Rd contents ... OK
* checking for unstated dependencies in examples ... OK
* checking contents of ‘data’ directory ... OK
* checking data for non-ASCII characters ... OK
* checking LazyData ... OK
* checking data for ASCII and uncompressed saves ... OK
* checking line endings in shell scripts ... OK
* checking line endings in C/C++/Fortran sources/headers ... OK
* checking line endings in Makefiles ... OK
* checking compilation flags in Makevars ... OK
* checking for GNU extensions in Makefiles ... OK
* checking for portable use of $(BLAS_LIBS) and $(LAPACK_LIBS) ... OK
* checking use of PKG_*FLAGS in Makefiles ... OK
* checking compiled code ... OK
* checking files in ‘vignettes’ ... WARNING
Files in the 'vignettes' directory but no files in 'inst/doc':
  ‘altrep_bench.png’, ‘df_bench_read.png’, ‘df_bench_write.png’,
    ‘vignette.html’, ‘vignette.rmd’
* checking examples ... OK
* checking for unstated dependencies in ‘tests’ ... OK
* checking tests ... OK
  Running ‘correctness_testing.R’
  Running ‘qattributes_testing.R’
  Running ‘qsavemload_testing.R’
* checking for unstated dependencies in vignettes ... OK
* checking package vignettes in ‘inst/doc’ ... WARNING
Directory 'inst/doc' does not exist.
Package vignette without corresponding single PDF/HTML:
   ‘vignette.rmd’

* checking running R code from vignettes ... NONE
  ‘vignette.rmd’ using ‘UTF-8’... OK
* checking re-building of vignette outputs ... OK
* checking PDF version of manual ... OK
* DONE
Status: 3 WARNINGs

barracuda156 avatar Mar 27 '23 21:03 barracuda156

I think I finally got the detection of GCC correct, see if it works now.

traversc avatar Mar 28 '23 05:03 traversc

> I think I finally got the detection of GCC correct, see if it works now.

Yes, it works now! Thank you.

barracuda156 avatar Mar 28 '23 17:03 barracuda156