Cirq icon indicating copy to clipboard operation
Cirq copied to clipboard

6097: Migrate from default Python serialization to orjson

Open navaro1 opened this issue 2 years ago • 6 comments

  1. The performance improvement is right now about ~8 times better (for 1000 qubits and 100 num_moments from 1.15±0.01s to 147±1ms). This puts us in the ballpark figure mentioned here: https://github.com/quantumlib/Cirq/pull/5957#issuecomment-1346837558
  2. orjson only supports none indentation or indentation of size 2 - source
  3. orjson does not support handling separators: Optional[Tuple[str, str]] - source

Performance differences (will be updated with consequent changes)

Before change (run with indent=None, default json library)

[100.00%] ··· serialization.SerializeLargeExpandedCircuits.time_json_serialization                                                                                                                                           ok
[100.00%] ··· ============ ============ ============
              --                  num_moments       
              ------------ -------------------------
               num_qubits      100          200     
              ============ ============ ============
                  100        114±2ms      231±4ms   
                  1000      1.15±0.01s   2.30±0.01s 
              ============ ============ ============

After changes (orjson library)

[ 37.50%] ··· ============ ============= ============= ============== ============ =============
              --                                             indent / contextual_serialization                    
              -------------------------- -------------------------------------------------------
               num_qubits   num_moments   None / True   None / False    2 / True     2 / False  
              ============ ============= ============= ============== ============ =============
                  100           100        50.4±0.6ms    14.4±0.1ms    51.2±0.5ms   14.7±0.08ms 
                  100           1000        515±1ms      145±0.9ms      521±1ms       155±1ms   
                  100           4000       2.06±0.01s     597±4ms       2.11±0s       639±5ms   
                  500           100         263±2ms      74.7±0.3ms    258±0.8ms     75.6±0.3ms 
                  500           1000       2.58±0.01s     757±2ms      2.61±0.01s     793±4ms   
                  500           4000       10.3±0.01s    3.04±0.05s     10.6±0s       3.17±0s   
                  1000          100         514±2ms       147±1ms       531±6ms       157±2ms   
                  1000          1000       5.21±0.02s    1.53±0.01s     5.29±0s      1.60±0.01s 
                  1000          4000       21.1±0.1s     6.14±0.04s    21.5±0.06s    6.44±0.04s 
              ============ ============= ============= ============== ============ =============

navaro1 avatar Jun 01 '23 04:06 navaro1

Is the new output compatible with serialization produced with builtin json?

Ie, after this PR will the serialization before and after orjson stay compatible?

pavoljuhas avatar Jun 07 '23 17:06 pavoljuhas

From Cirq sync:

Doug: We are adding a new dependency to Cirq. Is orjson well maintained and have good compliance / review process? Especially since we are talking about binary formats.

tanujkhattar avatar Jun 07 '23 17:06 tanujkhattar

Is the new output compatible with serialization produced with builtin json?

Based on the list of the features and drawbacks in https://github.com/ijl/orjson#orjson, it is stricter than the stdlib json, in conforming to the JSON spec.

Is orjson well maintained and have good compliance / review process?

I have seen that orjson has been used in Zulip for a while. The discussion comparing the alternatives can be found at https://github.com/zulip/zulip/issues/6507.

rht avatar Jun 08 '23 01:06 rht

Is the new output compatible with serialization produced with builtin json?

Almost always yes. read methods were not touched, and the tests are passing. The serialization output is not compatible when indendation is other than None or 2. orjson does not support other indendation levels.

navaro1 avatar Jun 08 '23 05:06 navaro1

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 97.84%. Comparing base (26dbabc) to head (62848b4). Report is 84 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #6115   +/-   ##
=======================================
  Coverage   97.84%   97.84%           
=======================================
  Files        1110     1110           
  Lines       96696    96709   +13     
=======================================
+ Hits        94612    94625   +13     
  Misses       2084     2084           

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

codecov[bot] avatar Aug 05 '23 17:08 codecov[bot]

xref https://github.com/quantumlib/Cirq/issues/6315

We now have a use case where serialization performance is becoming a bottleneck. @suyashdamle to check into whether adding the additional dependency to orjson is compliant with Google policy.

tanujkhattar avatar Nov 08 '23 19:11 tanujkhattar