The future of qs
I plan to deprecate the qs package in the future. There is a replacement available, qs2, on CRAN and GitHub (1).
There are two reasons:
- New CRAN enforcement regarding certain internal functions that qs relies on (2). Since qs handles serialization for both data and internal objects, maintaining proper serialization has become difficult without broader access to these now-restricted functions.
- qs was first released in 2019. Since then, there have been numerous changes/improvements to the internals of R and therefore its serialization of internal objects. As a result, R updates have sometimes caused qs to break in unexpected ways. Those breaks obviously cause disruption and have been time consuming to fix.
The new qs2 package addresses these issues.
It uses only approved API functions and is designed to be more future-proof. The package has two new formats:
- The qs2 format uses R's built-in serialization but improves upon it with better file I/O, zstd compression, byte shuffling and multithreading. This is a good 80/20 solution and doesn't require any update to ensure it works in the future.
- The qdata format, a spiritual successor to qs, features its own serialization for data only (vectors, data frames, lists, matrices, attributes). It outperforms qs and qs2 formats, especially with multithreading (3), and I also plan for limited cross-compatibility with Python later on.
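For anyone unfamiliar with byte shuffling, here is a minimal Python sketch of the general idea. It is not the qs2 implementation: the function names `shuffle_bytes` and `unshuffle_bytes` are made up for illustration, and `zlib` from the standard library stands in for zstd. Shuffling regroups the bytes of fixed-width elements so that all first bytes sit together, then all second bytes, and so on, which tends to give the compressor longer runs of similar values.

```python
import struct
import zlib


def shuffle_bytes(buf: bytes, width: int) -> bytes:
    """Group byte 0 of every element, then byte 1, and so on (illustrative helper)."""
    if len(buf) % width != 0:
        raise ValueError("buffer length must be a multiple of the element width")
    return b"".join(buf[i::width] for i in range(width))


def unshuffle_bytes(buf: bytes, width: int) -> bytes:
    """Invert shuffle_bytes, restoring the original element layout."""
    if len(buf) % width != 0:
        raise ValueError("buffer length must be a multiple of the element width")
    n = len(buf) // width
    out = bytearray(len(buf))
    for i in range(width):
        # Byte plane i goes back to position i of every element.
        out[i::width] = buf[i * n:(i + 1) * n]
    return bytes(out)


# Example data (an assumption for this sketch): 100,000 slowly increasing doubles.
values = [1000.0 + 0.5 * i for i in range(100_000)]
raw = struct.pack(f"<{len(values)}d", *values)

shuffled = shuffle_bytes(raw, 8)
plain_size = len(zlib.compress(raw, 6))
shuffled_size = len(zlib.compress(shuffled, 6))

print(f"raw: {len(raw)} bytes")
print(f"compressed without shuffling: {plain_size} bytes")
print(f"compressed with shuffling:    {shuffled_size} bytes")

# Shuffling is lossless: the original bytes come back exactly.
assert unshuffle_bytes(shuffled, 8) == raw
```

How much shuffling helps depends heavily on the data. Smooth numeric columns usually benefit most, while text or already-random bytes may see little or no gain.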
Thanks to everyone who used the qs package over the years and I hope qs2 will be a worthy successor!
(3) Benchmarks (4.5 GB mixed numeric/text data)
Single-threaded
| Algorithm | Compression | Save Time (s) | Read Time (s) |
|---|---|---|---|
| qs2 | 7.96 | 13.4 | 50.4 |
| qdata | 8.45 | 10.5 | 34.8 |
| base::serialize | 1.1 | 8.87 | 51.4 |
| saveRDS | 8.68 | 107 | 63.7 |
| fst | 2.59 | 5.09 | 46.3 |
| parquet | 8.29 | 20.3 | 38.4 |
| qs (legacy) | 7.97 | 9.13 | 48.1 |
Multi-threaded (8 threads)
| Algorithm | Compression | Save Time (s) | Read Time (s) |
|---|---|---|---|
| qs2 | 7.96 | 3.79 | 48.1 |
| qdata | 8.45 | 1.98 | 33.1 |
| fst | 2.59 | 5.05 | 46.6 |
| parquet | 8.29 | 20.2 | 37.0 |
| qs (legacy) | 7.97 | 3.21 | 52.0 |
Thanks @traversc for this note. Your decision is very respectable, although I would urge you to keep qs on CRAN as long as possible. Many packages have not moved, or cannot move, into full API compliance. Major packages including data.table would have great difficulty doing that. I don't see CRAN maintainers starting to crack down on non-compliant packages, especially packages that are highly depended upon such as qs.
Thanks @SebKrantz. For now, CRAN isn't forcing the issue. I hope data.table gets the official support they need.
@traversc perhaps one more note here, the CRAN policy suggests that only major x.y.0 updates may be forced to fix all issues. So it should be possible to keep .qs going with minor updates.
@SebKrantz Are you referring to this part?
> Maintainers will be asked to update packages which show any warnings or significant notes, especially at around the time of a new x.y.0 release. Packages which are not updated are liable to be archived.
R 4.5 is scheduled for Spring which is hopefully enough time to gracefully deprecate everything.
Will existing .qs files be readable via qs2? Thank you for your work.
Hi Hugh, no, it's not possible to read qs files with qs2.
But potentially a subset of the qs format could be read in while being API compliant, excluding language/function/internal objects which could be converted to NULL.
First of all, thank you very much for the amazing package!
Now, regarding the issue... I am humbly wondering what prevents qs2 from being deprecated sometime in the future too?
It might be helpful to include this warning in the README and/or DESCRIPTION files. I suspect many users may not check the issues section, which may explain why packages continue to add qs rather than qs2 as a dependency.
I added a notice to the top of the README. I'll leave the DESCRIPTION file as is, as I don't plan to update it on CRAN.