
The future of qs

Open traversc opened this issue 1 year ago • 7 comments

I plan to deprecate the qs package in the future. There is a replacement available, qs2, on CRAN and GitHub (1).

There are two reasons:

  • New CRAN enforcement regarding certain internal functions that qs relies on (2). Since qs handles serialization for both data and internal objects, maintaining proper serialization has become difficult without broader access to these now-restricted functions.

  • qs was first released in 2019. Since then, there have been numerous changes and improvements to the internals of R, and therefore to how it serializes internal objects. As a result, R updates have sometimes caused qs to break in unexpected ways. Those breaks cause disruption and have been time-consuming to fix.

The new qs2 package addresses these issues.

It uses only approved API functions and is designed to be more future-proof. The package has two new formats:

  • The qs2 format uses R's built-in serialization but improves upon it with better file I/O, zstd compression, byte shuffling and multithreading. This is a good 80/20 solution and doesn't require any updates to keep working in the future.

  • The qdata format, a spiritual successor to qs, features its own serialization for data only (vectors, data frames, lists, matrices, attributes). It outperforms both the qs and qs2 formats, especially with multithreading (3). I also plan to add limited cross-compatibility with Python later on.
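To make the byte shuffling mentioned above more concrete, here is a minimal Python sketch of the general idea. It is not the qs2 implementation: it uses zlib from the standard library as a stand-in for zstd, the helper names `shuffle_bytes` and `unshuffle_bytes` are hypothetical, and the sample data is made up for illustration. Shuffling regroups the bytes of fixed-width values so that all first bytes sit together, then all second bytes, and so on, which tends to give the compressor longer runs of similar bytes.

```python
import struct
import zlib


def shuffle_bytes(buf: bytes, width: int) -> bytes:
    """Regroup bytes: byte 0 of every element, then byte 1, and so on."""
    if len(buf) % width != 0:
        raise ValueError("buffer length must be a multiple of the element width")
    return b"".join(buf[i::width] for i in range(width))


def unshuffle_bytes(buf: bytes, width: int) -> bytes:
    """Invert shuffle_bytes, restoring the original element layout."""
    if len(buf) % width != 0:
        raise ValueError("buffer length must be a multiple of the element width")
    n = len(buf) // width
    out = bytearray(len(buf))
    for i in range(width):
        out[i::width] = buf[i * n:(i + 1) * n]
    return bytes(out)


# Illustrative data: 10,000 slowly increasing doubles (8 bytes each).
values = [1000.0 + 0.25 * i for i in range(10_000)]
raw = struct.pack(f"<{len(values)}d", *values)

shuffled = shuffle_bytes(raw, 8)

# zlib is only a stand-in here; qs2 itself uses zstd.
plain_size = len(zlib.compress(raw, 6))
shuffled_size = len(zlib.compress(shuffled, 6))

print("raw bytes:           ", len(raw))
print("compressed, plain:   ", plain_size)
print("compressed, shuffled:", shuffled_size)

# The transform is lossless: unshuffling restores the original buffer.
assert unshuffle_bytes(shuffled, 8) == raw
```

On regular numeric data like this, the shuffled buffer compresses to a smaller size than the plain one. The gain depends heavily on the data, so treat this as an illustration of the technique, not as a prediction of qs2's compression ratios.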

Thanks to everyone who used the qs package over the years and I hope qs2 will be a worthy successor!

(3) Benchmarks (4.5 GB mixed numeric/text data)

Single-threaded

| Algorithm | Compression | Save Time (s) | Read Time (s) |
|---|---|---|---|
| qs2 | 7.96 | 13.4 | 50.4 |
| qdata | 8.45 | 10.5 | 34.8 |
| base::serialize | 1.1 | 8.87 | 51.4 |
| saveRDS | 8.68 | 107 | 63.7 |
| fst | 2.59 | 5.09 | 46.3 |
| parquet | 8.29 | 20.3 | 38.4 |
| qs (legacy) | 7.97 | 9.13 | 48.1 |

Multi-threaded (8 threads)

| Algorithm | Compression | Save Time (s) | Read Time (s) |
|---|---|---|---|
| qs2 | 7.96 | 3.79 | 48.1 |
| qdata | 8.45 | 1.98 | 33.1 |
| fst | 2.59 | 5.05 | 46.6 |
| parquet | 8.29 | 20.2 | 37.0 |
| qs (legacy) | 7.97 | 3.21 | 52.0 |
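
To read the two benchmark tables side by side, the short Python sketch below computes the save-time speedup from one thread to eight for each format that appears in both tables. The numbers are copied from the tables above; the variable names are only illustrative.

```python
# Save times in seconds, copied from the benchmark tables above.
single_threaded = {
    "qs2": 13.4,
    "qdata": 10.5,
    "fst": 5.09,
    "parquet": 20.3,
    "qs (legacy)": 9.13,
}
multi_threaded = {  # 8 threads
    "qs2": 3.79,
    "qdata": 1.98,
    "fst": 5.05,
    "parquet": 20.2,
    "qs (legacy)": 3.21,
}

# Speedup = single-threaded save time / multi-threaded save time.
speedup = {name: single_threaded[name] / multi_threaded[name]
           for name in single_threaded}

# Print formats from largest to smallest speedup.
for name, factor in sorted(speedup.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<12} {factor:.2f}x")
```

By this measure qdata gains the most from multithreading (about 5.3x faster saves), followed by qs2 (about 3.5x) and legacy qs (about 2.8x), while the fst and parquet save times are essentially unchanged in these runs.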

traversc • Sep 28 '24 03:09