6097: Migrate from default Python serialization to orjson
- The performance improvement is right now about ~8 times better (for 1000
qubitsand 100num_momentsfrom1.15±0.01sto147±1ms). This puts us in the ballpark figure mentioned here: https://github.com/quantumlib/Cirq/pull/5957#issuecomment-1346837558 -
orjsononly supports none indentation or indentation of size 2 - source -
orjsondoes not support handlingseparators: Optional[Tuple[str, str]]- source
Performance differences (will be updated with consequent changes)
Before change (run with indent=None, default json library)
[100.00%] ··· serialization.SerializeLargeExpandedCircuits.time_json_serialization ok
[100.00%] ··· ============ ============ ============
-- num_moments
------------ -------------------------
num_qubits 100 200
============ ============ ============
100 114±2ms 231±4ms
1000 1.15±0.01s 2.30±0.01s
============ ============ ============
After changes (orjson library)
[ 37.50%] ··· ============ ============= ============= ============== ============ =============
-- indent / contextual_serialization
-------------------------- -------------------------------------------------------
num_qubits num_moments None / True None / False 2 / True 2 / False
============ ============= ============= ============== ============ =============
100 100 50.4±0.6ms 14.4±0.1ms 51.2±0.5ms 14.7±0.08ms
100 1000 515±1ms 145±0.9ms 521±1ms 155±1ms
100 4000 2.06±0.01s 597±4ms 2.11±0s 639±5ms
500 100 263±2ms 74.7±0.3ms 258±0.8ms 75.6±0.3ms
500 1000 2.58±0.01s 757±2ms 2.61±0.01s 793±4ms
500 4000 10.3±0.01s 3.04±0.05s 10.6±0s 3.17±0s
1000 100 514±2ms 147±1ms 531±6ms 157±2ms
1000 1000 5.21±0.02s 1.53±0.01s 5.29±0s 1.60±0.01s
1000 4000 21.1±0.1s 6.14±0.04s 21.5±0.06s 6.44±0.04s
============ ============= ============= ============== ============ =============
Is the new output compatible with serialization produced with builtin json?
Ie, after this PR will the serialization before and after orjson stay compatible?
From Cirq sync:
Doug: We are adding a new dependency to Cirq. Is orjson well maintained and have good compliance / review process? Especially since we are talking about binary formats.
Is the new output compatible with serialization produced with builtin json?
Based on the list of the features and drawbacks in https://github.com/ijl/orjson#orjson, it is stricter than the stdlib json, in conforming to the JSON spec.
Is orjson well maintained and have good compliance / review process?
I have seen that orjson has been used in Zulip for a while. The discussion comparing the alternatives can be found at https://github.com/zulip/zulip/issues/6507.
Is the new output compatible with serialization produced with builtin json?
Almost always yes. read methods were not touched, and the tests are passing. The serialization output is not compatible when indendation is other than None or 2. orjson does not support other indendation levels.
Codecov Report
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 97.84%. Comparing base (
26dbabc) to head (62848b4). Report is 84 commits behind head on main.
Additional details and impacted files
@@ Coverage Diff @@
## main #6115 +/- ##
=======================================
Coverage 97.84% 97.84%
=======================================
Files 1110 1110
Lines 96696 96709 +13
=======================================
+ Hits 94612 94625 +13
Misses 2084 2084
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
xref https://github.com/quantumlib/Cirq/issues/6315
We now have a use case where serialization performance is becoming a bottleneck. @suyashdamle to check into whether adding the additional dependency to orjson is compliant with Google policy.