FR: Switch to yyjsonr (or similar) from jsonlite
A big bottleneck for large dataframes in reactable is JSON conversion. Simply by switching to a faster JSON conversion package you get a massive speedup.
library(reactable)
library(tibble)
df <- tibble(
a = 1:1e8
)
system.time(
reactable(df)
)
Result:
user system elapsed
53.11 6.42 60.92
Now, we monkey-patch reactable to use yyjsonr instead of jsonlite:
to_json_adapter <- function(x, ..., digits = -1, auto_unbox = TRUE) {
result <- yyjsonr::write_json_str(
x,
opts = yyjsonr::opts_write_json(
digits = digits,
auto_unbox = auto_unbox,
dataframe = "columns"
)
)
class(result) <- "json"
result
}
unlockBinding("toJSON", asNamespace("reactable"))
assign("toJSON", to_json_adapter, envir = asNamespace("reactable"))
lockBinding("toJSON", asNamespace("reactable"))
Re-running the same block of code:
system.time(
reactable(df)
)
We now get:
user system elapsed
2.89 1.47 5.41
That's a >10x speedup, just by changing the JSON library! This means you can squeeze out 10x the juice of client-side reactable before you have to upgrade to server-side rendering... that's huge.
Implementation note: we'll also want to run the faster JSON serializer on the meta = argument (otherwise reactR uses the slower jsonlite by default).
That is really neat, thanks for the suggestion. JSON serialization is indeed probably the biggest bottleneck for reactable. It would be useful for the toJSON() methods in htmlwidgets/reactR as well, which widgets generally will use by default unless overriding it.
Do you know if yyjson is supposed to be a fully 1:1 jsonlite drop-in replacement? The only concern I'd have is that it'll have to match jsonlite's behavior exactly, including all its quirks and bugs to avoid breaking anyone.
Do you know if yyjson is supposed to be a fully 1:1 jsonlite drop-in replacement?
Nope, there are slight differences: https://coolbutuseless.github.io/package/yyjsonr/articles/jsonlite-comparison.html
Instead of wholesale switching to yyjson (or some other json serializer), you may want to consider simply allowing users to choose their json serializing implementation (by an arg to reactable() or to options()... actually having both options would be nice). This way you won't break any legacy code but give folks the ability to easily use whatever json serializer they want (and as newer and faster implementations become available people can use them without any more effort on your part).
Examples:
reactable(
df,
toJSON = function(x) {
jsonlite::toJSON(
x,
dataframe = "columns",
rownames = FALSE,
digits = getOption("reactable.json.digits", NA),
POSIXt = "ISO8601",
Date = "ISO8601",
UTC = TRUE,
force = TRUE,
auto_unbox = TRUE,
null = "null"
)
}
)
or
reactable(
df,
toJSON = function(x) {
yyjsonr::write_json_str(
x,
dataframe = "columns",
digits = getOption("reactable.json.digits", NA),
auto_unbox = TRUE
)
}
)
or
options(
reactable.json.serializer = function(x) {
yyjsonr::write_json_str(
x,
dataframe = "columns",
digits = getOption("reactable.json.digits", NA),
auto_unbox = TRUE
)
}
)
etc.
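A minimal sketch of how the lookup order could work if both an argument and an option were supported. The names `resolve_serializer` and `default_serializer` are hypothetical and not part of reactable; `reactable.json.serializer` is the option name proposed above, and the default simply reuses the jsonlite call from the first example.

```r
# Hypothetical default: the jsonlite call from the first example above.
# jsonlite is only needed when this function is actually called.
default_serializer <- function(x) {
  jsonlite::toJSON(
    x,
    dataframe = "columns",
    rownames = FALSE,
    digits = getOption("reactable.json.digits", NA),
    POSIXt = "ISO8601",
    Date = "ISO8601",
    UTC = TRUE,
    force = TRUE,
    auto_unbox = TRUE,
    null = "null"
  )
}

# Precedence: explicit argument, then the global option, then the default.
resolve_serializer <- function(toJSON = NULL) {
  if (is.null(toJSON)) {
    toJSON <- getOption("reactable.json.serializer", default_serializer)
  }
  if (!is.function(toJSON)) {
    stop("`toJSON` must be a function that takes the data and returns a JSON string")
  }
  toJSON
}
```

With this ordering, a per-table argument overrides a session-wide option, and legacy code that sets neither keeps the existing jsonlite behavior.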
I tried out yyjsonr and got it to mostly work, but there were a couple differences vs. jsonlite and possible bugs that I couldn't figure out how to get right. I think it'll be possible to fully switch over in the future, but we'd either need some workarounds or changes in yyjsonr itself.
So for now, I've switched meta over to use the internal JSON serializer:
https://github.com/glin/reactable/commit/76a3299ab21beaa67a61854bfa4f934d11440dab
meta is now converted to JavaScript in the same way as data, using reactable's internal JSON serialization function rather than htmlwidgets's JSON serialization function. This should only be a breaking change in rare cases, as the major difference is that numeric NA, NaN, Inf, and -Inf values are now serialized as strings and preserved, rather than always being converted to null. (@khusmann, #415)
And added a reactable.json.func option, mirroring the name of htmlwidgets's TOJSON_FUNC attribute:
JSON serialization of data can now be customized using the reactable.json.func option. This is an experimental feature for advanced use only, and intentionally undocumented outside of NEWS. reactable may change how data is serialized between versions and does not guarantee stability. See reactable:::toJSON as a reference for how data is currently serialized. (@khusmann, #415)
Example usage:
# Use yyjsonr as a faster alternative for JSON serialization. Note that this is not 1:1 consistent with
# jsonlite, and several edge cases are not handled here, including data frames with 1 row, datetimes, and NULLs.
options(reactable.json.func = function(x, ...) {
  result <- yyjsonr::write_json_str(
    x,
    opts = yyjsonr::opts_write_json(
      dataframe = "columns",
      auto_unbox = TRUE,
      num_specials = "string"
    )
  )
  class(result) <- "json"
  result
})
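One possible way to work around the edge cases named in that comment is to route only the "safe" inputs through yyjsonr and send everything else to jsonlite. The sketch below is an assumption, not something reactable ships: `needs_jsonlite` is a hypothetical helper, it only inspects the top level of a data frame (nested lists are not checked), and the jsonlite arguments are copied from the example earlier in this thread rather than from reactable's source.

```r
# Hypothetical predicate: TRUE when `x` hits one of the known edge cases
# (NULL, a one-row data frame, a Date/POSIXt column, or a list column).
needs_jsonlite <- function(x) {
  if (is.null(x)) return(TRUE)
  if (!is.data.frame(x)) return(FALSE)
  if (nrow(x) == 1) return(TRUE)
  any(vapply(x, function(col) {
    inherits(col, c("Date", "POSIXt")) || is.list(col)
  }, logical(1)))
}

options(reactable.json.func = function(x, ...) {
  if (needs_jsonlite(x)) {
    # Fall back to jsonlite for inputs that yyjsonr serializes differently.
    return(jsonlite::toJSON(
      x,
      dataframe = "columns",
      rownames = FALSE,
      digits = NA,
      POSIXt = "ISO8601",
      Date = "ISO8601",
      UTC = TRUE,
      force = TRUE,
      auto_unbox = TRUE,
      null = "null"
    ))
  }
  # Fast path: plain multi-row data frames and other simple inputs.
  result <- yyjsonr::write_json_str(
    x,
    opts = yyjsonr::opts_write_json(
      dataframe = "columns",
      auto_unbox = TRUE,
      num_specials = "string"
    )
  )
  class(result) <- "json"
  result
})
```

The fast path still only pays off for large tables, and the predicate costs one pass over the column classes, not over the data itself.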
These were the yyjsonr differences I could not work out. I may open issues over there to find out whether these are bugs or things to add to the jsonlite compatibility document.
# dataframe = "columns" with auto_unbox = TRUE does not work for dataframes with one row
df <- data.frame(x = 1, y = "b")
jsonlite::toJSON(df, dataframe = "columns", auto_unbox = TRUE)
# {"x":[1],"y":["b"]}
write_json_str(df, opts = opts_write_json(dataframe = "columns", digits = 0, auto_unbox = TRUE))
# [1] "{\"x\":1,\"y\":\"b\"}"
write_json_str(df, opts = opts_write_json(dataframe = "columns", digits = 0))
# [1] "{\"x\":[1],\"y\":[\"b\"]}"
# numbers have 1 digit by default unless digits = 0 (not a big deal, but just odd. it shouldn't matter in JS, but does it matter in JSON?)
jsonlite::toJSON(5)
# [5]
write_json_str(5)
# [1] "[5.0]"
write_json_str(5, digits = 0)
# [1] "[5]"
# datetimes are not supported, known limitation: https://github.com/coolbutuseless/yyjsonr?tab=readme-ov-file#limitations
data <- data.frame(x = as.POSIXct("2019-05-06 3:22:15", tz = "UTC"), y = as.Date("2010-12-30"))
jsonlite::toJSON(data, Date = "ISO8601", POSIXt = "ISO8601", UTC = TRUE)
# [{"x":"2019-05-06T03:22:15Z","y":"2010-12-30"}]
write_json_str(data)
# [1] "[{\"x\":\"2019-05-06 03:22:15\",\"y\":\"2010-12-30\"}]"
# Unexpected NULL handling behavior
write_json_str(NULL, opts = opts_write_json(str_specials = "null"))
# [1] "[]"
write_json_str(NULL, opts = opts_write_json())
# [1] "[]"
jsonlite::toJSON(NULL, null = "null")
# null
df <- data.frame(x = I(list(NA, NULL)))
jsonlite::toJSON(df, null = "null", auto_unbox = TRUE, dataframe = "columns")
# {"x":[null,null]}
write_json_str(df, opts = opts_write_json(str_specials = "null", auto_unbox = TRUE, dataframe = "columns"))
# [1] "{\"x\":[null,[]]}"
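The comparisons above could be folded into a small check that is rerun whenever yyjsonr is updated. This is only a sketch: `compare_serializers` is a hypothetical helper, and the two wrapped calls reuse the options from the one-row example above.

```r
# Hypothetical helper: run two serializers over a named list of inputs and
# report, per input, whether they produce the same JSON string.
compare_serializers <- function(inputs, reference, candidate) {
  vapply(inputs, function(x) {
    identical(as.character(reference(x)), as.character(candidate(x)))
  }, logical(1))
}

# Only run the real comparison when both packages are installed.
if (requireNamespace("jsonlite", quietly = TRUE) &&
    requireNamespace("yyjsonr", quietly = TRUE)) {
  inputs <- list(
    one_row = data.frame(x = 1, y = "b"),
    multi_row = data.frame(x = c(1, 2), y = c("a", "b"))
  )
  print(compare_serializers(
    inputs,
    reference = function(x) {
      jsonlite::toJSON(x, dataframe = "columns", auto_unbox = TRUE)
    },
    candidate = function(x) {
      yyjsonr::write_json_str(
        x,
        opts = yyjsonr::opts_write_json(
          dataframe = "columns", digits = 0, auto_unbox = TRUE
        )
      )
    }
  ))
}
```

Any input that comes back `FALSE` is a candidate for an upstream issue or for a workaround on the reactable side.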
This is fantastic work, and seems like a perfect place to leave things until all the yyjsonr differences are ironed out. Thanks for all your effort on this!