Differences between AerJob and IBMQJob for failing jobs
Information
- Qiskit AER version: 0.10.4
- Operating system: Linux
What is the current behavior?
Suppose that a job fails. For example, it contains a circuit with gates that don't belong to the backend's set of basis gates. The failure looks quite different depending on whether the job runs on a real device or on a simulator:
- job.result() will raise an exception for real devices, but not for simulators.
- The job's status will be ERROR for real devices, and DONE for simulators (for simulators, the information that an error has occurred is hidden in a status field inside the result).
- job.error_message() will work only for devices; for simulators it will raise an AttributeError exception.
I'd like to point out that this is a real and even annoying problem. I'm writing code for qiskit-experiments. Our experiments typically run on devices, but our tests and tutorials must run on the simulator (when applicable, i.e. measurement level 2 etc.), either directly or via a fake backend. Now:
- We don't want to fill our code with ifs or try-excepts to distinguish between job types.
- Even if we did, we would still have an issue with the coverage of our tests, because the tests wouldn't cover the if branches that apply only to devices.
Steps to reproduce the problem
Code:
from qiskit import QuantumCircuit, Aer, IBMQ

provider = IBMQ.load_account()
backends = [Aer.get_backend("aer_simulator_stabilizer"), provider.backend.ibmq_lima]

# The T gate is not supported by the stabilizer method, nor is it a basis gate of ibmq_lima
circ = QuantumCircuit(1, 1)
circ.t(0)
circ.measure(0, 0)

for backend in backends:
    print("\nRunning with", backend.name())
    job = backend.run(circ)
    # 1. Does fetching the result raise?
    try:
        print("Job result:", job.result())
    except Exception as e:
        print("Exception when trying to fetch job result:", e)
        print(f" (exception type is {type(e).__name__})")
    # 2. What status does the job report?
    print("Job status:", job.status())
    # 3. Is an error message available?
    try:
        print("Job's error message:", job.error_message())
    except Exception as e:
        print("Exception when trying to fetch job's error message:", e)
        print(f" (exception type is {type(e).__name__})")
Output:
Running with aer_simulator_stabilizer
Simulation failed and returned the following error message:
ERROR: [Experiment 0] Circuit circuit-80 contains invalid instructions {"gates": {t}} for "stabilizer" method.
Job result: Result(backend_name='aer_simulator_stabilizer', backend_version='0.10.4', qobj_id='b51d3bc1-0f3d-44ff-a095-e61e0c8b2acf', job_id='cebe6869-9913-455d-a73f-dd2995f272fb', success=False, results=[ExperimentResult(shots=0, success=False, meas_level=2, data=ExperimentResultData(), status=ERROR: Circuit circuit-80 contains invalid instructions {"gates": {t}} for "stabilizer" method., seed_simulator=0, metadata={'noise': 'ideal', 'batched_shots_optimization': False, 'measure_sampling': False, 'device': 'CPU', 'num_qubits': 1, 'remapped_qubits': False, 'method': 'stabilizer', 'active_input_qubits': [0], 'num_clbits': 1, 'input_qubit_map': [[0, 0]]}, time_taken=0.0)], date=2022-08-04T17:35:12.808589, status=ERROR: [Experiment 0] Circuit circuit-80 contains invalid instructions {"gates": {t}} for "stabilizer" method., header=QobjHeader(backend_name='aer_simulator_stabilizer', backend_version='0.10.4'), metadata={'time_taken': 0.000158406, 'time_taken_execute': 5.7428e-05, 'parallel_experiments': 1, 'omp_enabled': True, 'max_gpu_memory_mb': 0, 'num_mpi_processes': 1, 'time_taken_load_qobj': 9.4399e-05, 'max_memory_mb': 7068, 'mpi_rank': 0}, time_taken=0.00026607513427734375)
Job status: JobStatus.DONE
Exception when trying to fetch job's error message: 'AerJob' object has no attribute 'error_message'
(exception type is AttributeError)
Running with ibmq_lima
Exception when trying to fetch job result: "Unable to retrieve result for job 62ebd9224f71a3657b0dadce. Job has failed: The Qobj uses gates (['t']) that are not among the basis gates (['id', 'rz', 'sx', 'x', 'cx', 'reset']). Error code: 1106."
(exception type is IBMQJobFailureError)
Job status: JobStatus.ERROR
Job's error message: The Qobj uses gates (['t']) that are not among the basis gates (['id', 'rz', 'sx', 'x', 'cx', 'reset']). Error code: 1106.
(We see the same behavior also when we use the regular, non-stabilizer simulator with the ch gate, which is not in its basis gates).
What is the expected behavior?
Identical behavior of jobs, regardless of their origin.
Specifically I have a concrete bug in qiskit-experiments right now (https://github.com/Qiskit/qiskit-experiments/issues/866) that I can test and debug only on a device. When I fix it, I'll want to add a test to our test suite, that will be able to catch the bug in the future if it returns. To overcome what I've just described, here is what I intend to do (I still don't know if it will work): for the sake of the test, I'll write a subclass of AerSimulator and a subclass of AerJob; I'll tweak their behavior to mimic devices.
This is what the hack looks like:
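# Import paths as of Aer 0.10.x
from qiskit.exceptions import QiskitError
from qiskit.providers import JobStatus
from qiskit.providers.aer import AerSimulator
from qiskit.providers.aer.jobs import AerJob

class MyBackend(AerSimulator):
    def run(self, run_input, **options):
        job = super().run(run_input, **options)
        # Swap the job's class so that it behaves like a failed device job
        job.__class__ = MyJob
        return job

class MyJob(AerJob):
    def result(self, timeout=None):
        raise QiskitError

    def status(self):
        return JobStatus.ERROR

    def error_message(self):
        return "You're dealing with the wrong job, man"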
I think Aer and IBMQ handle job status differently:
- IBMQ: JobStatus.ERROR if the experiments or the job scheduler fail
- Aer: JobStatus.ERROR only if the job scheduler (thread pool executor) fails
I proposed a new option ibmq_semantics=True in #1575 and would like to discuss whether it is reasonable.
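To make the discussion concrete, here is a rough caller-side sketch of what device-like semantics could mean for a simulator-style job. It is not the implementation proposed in #1575. DeviceLikeJob and the Status enum are hypothetical names; Status only stands in for qiskit.providers.JobStatus so that the snippet runs without Qiskit installed. The one assumption about the wrapped job is that result() returns an object with success and status fields, as in the Aer output above.

```python
from enum import Enum


class Status(Enum):
    """Hypothetical stand-in for qiskit.providers.JobStatus."""
    DONE = "done"
    ERROR = "error"


class DeviceLikeJob:
    """Wrap a simulator-style job so that failures surface as they do on devices."""

    def __init__(self, job):
        self._job = job

    def _failure(self):
        # Simulator-style jobs record a failure inside the result object.
        result = self._job.result()
        if getattr(result, "success", True):
            return None
        return str(getattr(result, "status", "unknown error"))

    def result(self):
        message = self._failure()
        if message is not None:
            # A real adapter would raise a provider-specific exception here.
            raise RuntimeError(message)
        return self._job.result()

    def status(self):
        # Note: this blocks until the wrapped job has produced a result.
        return Status.DONE if self._failure() is None else Status.ERROR

    def error_message(self):
        return self._failure()
```

Unlike the subclass hack, this wrapper reports an error only when the wrapped result says the run failed, so the same object can be used in tests for both the success path and the failure path.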
Additional cases where Aer's behavior differs from devices:
- Configuration of AerSimulator states that n_qubits is equal to 28; however, Aer agrees to simulate circuits with 30 qubits.
- When Aer is run with measurement_level set to 1, it ignores this option and runs as usual, producing counts; I'd expect an exception saying that this is not supported.
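Until the behavior is aligned, a caller could guard against these two cases before submitting. The following is only a sketch: check_before_run is a hypothetical helper, its measurement_level parameter is the helper's own argument rather than a confirmed run option name, and the rule that only level 2 is accepted is an assumption based on the observation that the simulator always produces counts. It relies on backend.configuration().n_qubits and circuit.num_qubits, and is written with duck typing so it can be tried without Qiskit installed.

```python
def check_before_run(backend, circuit, measurement_level=2):
    """Raise ValueError for requests that a device would be expected to reject."""
    max_qubits = backend.configuration().n_qubits
    if circuit.num_qubits > max_qubits:
        raise ValueError(
            f"Circuit uses {circuit.num_qubits} qubits, "
            f"but the backend reports n_qubits={max_qubits}"
        )
    # Assumption: the simulator only returns counts (measurement level 2).
    if measurement_level != 2:
        raise ValueError(
            f"Measurement level {measurement_level} is not supported; only level 2 is"
        )
```

Calling check_before_run(backend, circ) right before backend.run(circ) would turn both silent mismatches into explicit errors in tests.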
Since #1575 was rejected, Aer cannot maintain two sets of semantics. Also, because Aer needs to distinguish failures of an executor (including the thread pool and Dask executors) from failures of experiments, JobStatus.ERROR should report only executor errors, so it cannot share the IBMQ provider's semantics. I think similar semantic differences will appear whenever a third-party provider is used, and only callers can handle them.
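If the differences have to be absorbed on the caller's side, one option is a single helper that hides both behaviors, so the rest of the code needs no per-provider branches. This is a minimal sketch, not an API of Qiskit or Aer: describe_failure is a hypothetical name, and it assumes only what the output above shows, namely that device jobs raise from result() and offer error_message(), while simulator jobs return a result carrying success and status fields.

```python
def describe_failure(job):
    """Return None if the job succeeded, otherwise a string describing the failure."""
    try:
        result = job.result()
    except Exception as exc:
        # Device style: result() raises, and error_message() may give details.
        get_message = getattr(job, "error_message", None)
        message = get_message() if callable(get_message) else None
        return message or str(exc)
    # Simulator style: result() returns normally and the failure is recorded inside it.
    if getattr(result, "success", True):
        return None
    return str(getattr(result, "status", "unknown error"))
```

Catching Exception broadly keeps the sketch provider-agnostic, but it also treats timeouts and network errors as job failures; real code would probably narrow the exception types.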