Use `orjson` for JSON serialization

Open judahrand opened this issue 1 year ago • 2 comments

What does this PR address?

`orjson` appears to be roughly 10x faster than the stdlib `json` for round-trip serialization:

In [1]: import json

In [2]: import orjson

In [3]: import random

In [4]: py_list = [random.random() for x in range(1000)]

In [5]: %timeit orjson.loads(orjson.dumps(py_list))
40.9 μs ± 178 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

In [6]: %timeit json.loads(json.dumps(py_list))
522 μs ± 8.43 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

In [7]: %timeit orjson.loads(orjson.dumps('string'))
93.4 ns ± 0.882 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)

In [8]: %timeit json.loads(json.dumps('string'))
824 ns ± 3.81 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

Fixes #(issue)

judahrand, Jun 13 '24 13:06

This should better be an optional dependency and expose a uniform API from a wrapper module.
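
A minimal sketch of what such a wrapper module might look like, assuming `orjson` is installed as an optional extra. The `dumps`/`loads` names and the module layout are hypothetical and not part of BentoML's actual API. The fallback path uses compact separators and `ensure_ascii=False` so that both backends return similar UTF-8 bytes; other behavioral differences between the two libraries (for example, handling of `NaN`) are not smoothed over here.

```python
"""Hypothetical JSON wrapper module; names are illustrative only."""
import json
from typing import Any, Union

try:
    import orjson  # optional dependency: used only when installed
except ImportError:
    orjson = None


def dumps(obj: Any) -> bytes:
    """Serialize obj to UTF-8 encoded JSON bytes with either backend."""
    if orjson is not None:
        return orjson.dumps(obj)
    # Stdlib fallback: compact separators and raw UTF-8 output,
    # which is closer to what orjson emits by default.
    text = json.dumps(obj, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def loads(data: Union[bytes, bytearray, str]) -> Any:
    """Deserialize JSON from bytes or str with either backend."""
    if orjson is not None:
        return orjson.loads(data)
    return json.loads(data)
```

Callers would import `dumps` and `loads` from this module instead of importing `json` or `orjson` directly, so the backend can change without touching call sites.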

frostming, Jun 14 '24 00:06

> This should better be an optional dependency and expose a uniform API from a wrapper module.

Why do you think that is necessary? pydantic is already a dependency, and it requires a Rust toolchain if wheels are not available. Adding orjson doesn't feel like much of a stretch, especially since it builds wheels for basically every platform. In addition, it is only ~500 kB.

judahrand, Jun 14 '24 08:06