ProjectQ
atexit.unregister function speeds up computation sometimes
I have two (almost) identical simulations of a circuit in two different IPython notebooks. In one notebook the simulation is much faster than in the other. The effect does not depend on the notebook itself, because I have seen both the fast and the slow case in each notebook. I have appended snippets of the %prun logs at the end of this post. In both cases the compiled C++ simulator should be in use (I assume this because _simulator.py:55(__init__) appears in the log). I assume I initialized something differently, but I do not know what. In the fast case much more time is spent in atexit.unregister, but I do not understand what effect this function has on the simulation.
I initialize the engine and the circuit and do the measurement inside the objective function passed to scipy.optimize.minimize. The broad code structure is:
def experiments(parameter):
    for _ in range(100):
        # a fresh engine and simulator are created for every repetition
        eng = projectq.MainEngine(backend=projectq.backends.Simulator(gate_fusion=True), engine_list=[])
        q = eng.allocate_qubit()
        circuit(parameter, eng, q)
        projectq.ops.Measure | q
        eng.flush()
    # do some non-projectq related postprocessing

scipy.optimize.minimize(experiments, init_parameter, ...)
I also tried putting all the code in one function and initializing the engine at different points, but that does not seem to change the result.
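For anyone who wants to collect the same kind of log outside a notebook, here is a minimal sketch using the standard-library cProfile and pstats modules. It sorts by internal time, like the %prun logs below. The names profile_by_internal_time and workload are hypothetical and only serve the illustration; workload stands in for the function that runs the circuit.

```python
import cProfile
import io
import pstats


def profile_by_internal_time(func, *args, top=10):
    """Run func(*args) under cProfile; return (result, report sorted by tottime)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)

    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # "tottime" is the internal time column, the same ordering as in the logs
    stats.sort_stats("tottime").print_stats(top)
    return result, buffer.getvalue()


def workload(n):
    # hypothetical stand-in for the function that runs the circuit
    return sum(i * i for i in range(n))


if __name__ == "__main__":
    _, report = profile_by_internal_time(workload, 100_000)
    print(report)
```

Running the fast and the slow case through the same helper would make the two reports directly comparable.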
Here is a snapshot of the log of the fast computation:
1066623 function calls (1066621 primitive calls) in 3.150 seconds
Ordered by: internal time
ncalls tottime percall cumtime percall filename:lineno(function)
766 1.272 0.002 1.272 0.002 {built-in method atexit.unregister}
13000 0.277 0.000 1.098 0.000 _simulator.py:350(_handle)
14600 0.180 0.000 1.295 0.000 _simulator.py:422(receive)
23421 0.135 0.000 0.135 0.000 {built-in method numpy.core.multiarray.array}
23400 0.117 0.000 0.221 0.000 {built-in method __new__ of type object at 0x9e5d60}
7200 0.115 0.000 0.130 0.000 {built-in method numpy.core.multiarray.dot}
23401 0.090 0.000 0.090 0.000 {built-in method _warnings.warn}
23400 0.085 0.000 0.544 0.000 defmatrix.py:112(__new__)
14600 0.064 0.000 1.419 0.000 _command.py:86(__init__)
217841 0.053 0.000 0.053 0.000 {built-in method builtins.isinstance}
13600 0.045 0.000 0.057 0.000 _basics.py:123(make_tuple_of_qureg)
37800 0.039 0.000 0.120 0.000 defmatrix.py:164(__array_finalize__)
16800 0.037 0.000 0.038 0.000 {built-in method builtins.sorted}
9800 0.034 0.000 0.243 0.000 _basics.py:166(generate_command)
800 0.033 0.000 1.607 0.002 <ipython-input-7-1a377ac0e731>:1(h2_bk_circuit)
7200 0.030 0.000 0.320 0.000 _gates.py:55(matrix)
2200 0.026 0.000 0.453 0.000 _metagates.py:190(__or__)
14600 0.024 0.000 0.059 0.000 _command.py:214(control_qubits)
800 0.024 0.000 0.030 0.000 _simulator.py:55(__init__)
14600 0.022 0.000 0.026 0.000 _command.py:173(_order_qubits)
14600 0.021 0.000 0.026 0.000 _command.py:263(engine)
14600 0.017 0.000 0.100 0.000 _command.py:109(<listcomp>)
29200 0.017 0.000 0.772 0.000 _command.py:109(<genexpr>)
4800 0.016 0.000 0.121 0.000 _gates.py:211(matrix)
14600 0.014 0.000 0.040 0.000 _command.py:123(qubits)
9000 0.013 0.000 0.258 0.000 _gates.py:68(matrix)
800 0.013 0.000 3.056 0.004 <ipython-input-8-990688fbed52>:2(run_h2_bk_circuit)
37600 0.013 0.000 0.021 0.000 _basics.py:202(__eq__)
8200 0.012 0.000 1.341 0.000 _basics.py:184(__or__)
9800 0.012 0.000 1.168 0.000 _command.py:47(apply_command)
1600 0.012 0.000 0.068 0.000 _basics.py:134(deallocate_qubit)
14600 0.011 0.000 1.354 0.000 _main.py:268(send)
800 0.010 0.000 0.015 0.000 _main.py:57(__init__)
21600 0.010 0.000 0.010 0.000 _qubit.py:44(__init__)
And here is one from the slow computation:
1048203 function calls (1048201 primitive calls) in 44.361 seconds
Ordered by: internal time
ncalls tottime percall cumtime percall filename:lineno(function)
13000 41.300 0.003 43.490 0.003 _simulator.py:350(_handle)
7200 1.401 0.000 1.420 0.000 {built-in method numpy.core.multiarray.dot}
23421 0.164 0.000 0.164 0.000 {built-in method numpy.core.multiarray.array}
23400 0.147 0.000 0.188 0.000 {built-in method __new__ of type object at 0x9e5d60}
23401 0.115 0.000 0.115 0.000 {built-in method _warnings.warn}
23400 0.103 0.000 0.589 0.000 defmatrix.py:112(__new__)
217041 0.075 0.000 0.075 0.000 {built-in method builtins.isinstance}
13800 0.074 0.000 0.288 0.000 _command.py:86(__init__)
13800 0.065 0.000 43.576 0.003 _simulator.py:422(receive)
13600 0.057 0.000 0.074 0.000 _basics.py:123(make_tuple_of_qureg)
16000 0.046 0.000 0.047 0.000 {built-in method builtins.sorted}
37800 0.045 0.000 0.060 0.000 defmatrix.py:164(__array_finalize__)
9800 0.041 0.000 0.303 0.000 _basics.py:166(generate_command)
800 0.040 0.000 27.540 0.034 <ipython-input-4-a0c9527638d5>:4(h2_bk_circuit)
7200 0.036 0.000 1.648 0.000 _gates.py:55(matrix)
2200 0.032 0.000 7.462 0.003 _metagates.py:190(__or__)
13800 0.027 0.000 0.071 0.000 _command.py:214(control_qubits)
13800 0.026 0.000 0.033 0.000 _command.py:173(_order_qubits)
13800 0.025 0.000 0.031 0.000 _command.py:263(engine)
779 0.020 0.000 0.020 0.000 {built-in method atexit.unregister}
4800 0.019 0.000 0.145 0.000 _gates.py:211(matrix)
13800 0.019 0.000 0.027 0.000 _command.py:109(<listcomp>)
1600 0.018 0.000 3.011 0.002 _basics.py:85(allocate_qubit)
800 0.016 0.000 0.024 0.000 _simulator.py:55(__init__)
27600 0.016 0.000 0.056 0.000 _command.py:109(<genexpr>)
9000 0.016 0.000 0.244 0.000 _gates.py:68(matrix)
9800 0.016 0.000 34.245 0.003 _command.py:47(apply_command)
13800 0.015 0.000 0.049 0.000 _command.py:123(qubits)
36800 0.015 0.000 0.028 0.000 _basics.py:202(__eq__)
8200 0.015 0.000 27.376 0.003 _basics.py:184(__or__)
800 0.015 0.000 37.818 0.047 <ipython-input-5-0573d3543f43>:5(run_h2_bk_circuit)
13800 0.014 0.000 43.620 0.003 _main.py:268(send)
1600 0.013 0.000 6.508 0.004 _basics.py:134(deallocate_qubit)
52633 0.012 0.000 0.012 0.000 {built-in method builtins.len}
20000 0.012 0.000 0.012 0.000 _qubit.py:44(__init__)
800 0.011 0.000 0.016 0.000 _main.py:57(__init__)
2400 0.011 0.000 0.023 0.000 _basics.py:243(__init__)
9800 0.010 0.000 0.017 0.000 {built-in method builtins.all}
2400 0.010 0.000 0.010 0.000 {built-in method builtins.round}
8200 0.010 0.000 0.024 0.000 defmatrix.py:261(tolist)
17000 0.008 0.000 0.008 0.000 _basics.py:65(__init__)
2400 0.008 0.000 0.073 0.000 _gates.py:231(matrix)
10600 0.008 0.000 34.235 0.003 _main.py:258(receive)
9800 0.008 0.000 0.008 0.000 _basics.py:179(<listcomp>)
1600 0.007 0.000 6.534 0.004 _qubit.py:121(__del__)
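As a quick side-by-side of the two logs, the per-call cost of _simulator.py:350(_handle) can be worked out from the tottime and ncalls columns. The numbers below are copied from the two logs above; the variable names are only illustrative.

```python
# tottime (seconds) and ncalls for _simulator.py:350(_handle),
# copied from the fast and the slow %prun log respectively
fast_tottime, fast_ncalls = 0.277, 13000
slow_tottime, slow_ncalls = 41.300, 13000

fast_per_call_ms = 1000 * fast_tottime / fast_ncalls
slow_per_call_ms = 1000 * slow_tottime / slow_ncalls
slowdown = slow_tottime / fast_tottime

print(f"fast: {fast_per_call_ms:.3f} ms per call")
print(f"slow: {slow_per_call_ms:.3f} ms per call")
print(f"slowdown in _handle: about {slowdown:.0f}x")
```

The call counts are identical, so the slowdown of roughly two orders of magnitude comes from each _handle call taking longer. Since 41.300 s of the 44.361 s total is internal time in _handle, the difference seems to sit in the simulator backend and not in the Python-level circuit code.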
Are you able to reproduce this?
Did you export OMP_NUM_THREADS=1 for the fast version? If you only use a single qubit, you shouldn't be using multiple threads. Also, gate fusion should be turned off for a single qubit: with it on, the simulator performs a bunch of matrix-matrix multiplications instead of just matrix-vector multiplications.
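Both suggestions could be applied from inside the notebook roughly as follows. This is a sketch under assumptions. It assumes that setting OMP_NUM_THREADS through os.environ before projectq is first imported has the same effect as exporting it in the shell; if projectq is already loaded, the kernel would need a restart first. make_engine is a hypothetical helper name. The MainEngine and Simulator arguments are the ones from the snippet in the question, with gate_fusion switched to False.

```python
import os

# Assumption: this runs before projectq (and its OpenMP runtime) is loaded,
# so it acts like `export OMP_NUM_THREADS=1` in the shell.
os.environ["OMP_NUM_THREADS"] = "1"


def make_engine():
    """Hypothetical helper: build an engine with gate fusion disabled."""
    # Imported here so the environment variable above is already set.
    from projectq import MainEngine
    from projectq.backends import Simulator

    return MainEngine(backend=Simulator(gate_fusion=False), engine_list=[])
```

Inside experiments, eng = make_engine() would then replace the MainEngine construction from the question.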