Can't use EngineSampler in multiprocessing
I was trying to call run from a function that I'm mapping over with multiprocessing.Pool.map to run a couple of jobs in parallel. Multiprocessing requires the function you're mapping over to be pickleable so it can be sent to the spawned processes. Lambda functions aren't pickleable, and there's something in cirq.google gumming up the works:
```
~/cirq/cirq/cirq/experiments/fidelity_estimation.py in sample_2q_xeb_circuits(sampler, circuits, cycle_depths, repetitions)
    444     with tqdm.tqdm(total=n_tasks) as progress:
    445         with multiprocessing.Pool(2) as pool:
--> 446             records = pool.map(executor, tasks)
    447             progress.update(chunksize)
    448

~/cirq/env/lib/python3.7/multiprocessing/pool.py in map(self, func, iterable, chunksize)
    266         in a list that is returned.
    267         '''
--> 268         return self._map_async(func, iterable, mapstar, chunksize).get()
    269
    270     def starmap(self, func, iterable, chunksize=None):

~/cirq/env/lib/python3.7/multiprocessing/pool.py in get(self, timeout)
    655             return self._value
    656         else:
--> 657             raise self._value
    658
    659     def _set(self, i, obj):

~/cirq/env/lib/python3.7/multiprocessing/pool.py in _handle_tasks(taskqueue, put, outqueue, pool, cache)
    429                     break
    430                 try:
--> 431                     put(task)
    432                 except Exception as e:
    433                     job, idx = task[:2]

~/cirq/env/lib/python3.7/multiprocessing/connection.py in send(self, obj)
    204         self._check_closed()
    205         self._check_writable()
--> 206         self._send_bytes(_ForkingPickler.dumps(obj))
    207
    208     def recv_bytes(self, maxlength=None):

~/cirq/env/lib/python3.7/multiprocessing/reduction.py in dumps(cls, obj, protocol)
     49     def dumps(cls, obj, protocol=None):
     50         buf = io.BytesIO()
---> 51         cls(buf, protocol).dump(obj)
     52         return buf.getbuffer()
     53

PicklingError: Can't pickle <function <lambda> at 0x7fb82ee9e5f0>: attribute lookup <lambda> on cirq.google.common_serializers failed
```
Yeah, the serialization code uses lambdas all over the place for extracting and validating gate parameters. We can convert those into instances of proper classes to support pickling with multiprocessing. More generally, I think there are a lot of opportunities to clean up this code.
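The failure mode, and the proposed fix, can be sketched in plain Python (the names `get_exponent` and `GetParam` are hypothetical stand-ins, not cirq's actual serializer internals): a lambda is pickled by reference to its name, which is always `<lambda>` and so can't be looked up, while an instance of a module-level callable class pickles fine.

```python
import pickle

# Hypothetical lambda-based parameter extractor, like those that were
# in cirq.google.common_serializers. Pickle looks the function up by its
# qualified name, which for a lambda is "<lambda>", so this fails.
get_exponent = lambda op: op["exponent"]


class GetParam:
    """Picklable replacement: a module-level callable class instance."""

    def __init__(self, key):
        self.key = key

    def __call__(self, op):
        return op[self.key]


try:
    pickle.dumps(get_exponent)
    lambda_picklable = True
except (pickle.PicklingError, AttributeError):
    lambda_picklable = False

# The class instance round-trips through pickle without trouble.
restored = pickle.loads(pickle.dumps(GetParam("exponent")))
value = restored({"exponent": 0.5})
print(lambda_picklable, value)
```

Because `GetParam` is importable by name and its state lives in `self.key`, instances can cross a process boundary, which is exactly what `multiprocessing.Pool.map` needs.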
For people reading this: I switched to ThreadPoolExecutor, which works, but I think people might expect our library to work under multiprocessing, especially given Python's threading drawbacks.
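The workaround works because threads share the interpreter, so nothing is pickled. A minimal sketch of the pattern (`run_task` below is a placeholder for the call into `sampler.run`, not a real cirq API):

```python
from concurrent.futures import ThreadPoolExecutor

# Placeholder for the per-circuit work; with threads, even a lambda or
# closure would be fine here, since no pickling is involved.
def run_task(task):
    return task * 2  # stands in for sampler.run(circuit, ...)

tasks = [1, 2, 3]
with ThreadPoolExecutor(max_workers=2) as ex:
    results = list(ex.map(run_task, tasks))
print(results)
```

For I/O-bound calls to a remote service like the quantum engine, threads sidestep the GIL's compute limitations anyway, which is why this workaround is adequate in practice.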
Latest on this issue:
We do have async methods on Samplers now. Does that make this issue obsolete? Do any of the other cirq-google workflow tools make it obsolete? Should we still pursue this?
Questions for @mpharrigan
The async methods on EngineSampler currently use the default implementations, which fall back to the sync versions and so don't actually do anything asynchronous. Actual support for async calls to the quantum engine API is in the works here: https://github.com/quantumlib/Cirq/pull/5219.
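To illustrate why the default async methods don't buy any concurrency, here is a minimal sketch (not cirq's actual implementation; the class and method names are illustrative) of an async method that simply delegates to its synchronous counterpart:

```python
import asyncio

# Illustrative sketch: a base class whose default async method just
# calls the sync version. It awaits nothing, so it blocks the event
# loop for the full duration of run() and provides no real concurrency
# until a subclass overrides it with a genuinely asynchronous call.
class Sampler:
    def run(self, circuit):
        return f"result({circuit})"  # stands in for a blocking RPC

    async def run_async(self, circuit):
        # Default fallback: synchronous call inside an async wrapper.
        return self.run(circuit)

result = asyncio.run(Sampler().run_async("bell"))
print(result)
```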
Separately, I do think we should refactor the lambdas in the circuit serializer code.
We got rid of the lambdas in serializers and implemented async operations as well as streaming.
I think this is obsolete, as there is a good workaround and most of the original issue is also fixed (though I didn't test multiprocessing). I am not sure we want to officially support multiprocessing.
common_serializers.py was removed in #5764. Closing as obsolete.