Differences between AerJob and IBMQJob for failing jobs

Open yaelbh opened this issue 2 years ago • 3 comments

Information

  • Qiskit AER version: 0.10.4
  • Operating system: Linux

What is the current behavior?

Suppose that a job fails, for example because it contains a circuit with gates that don't belong to the backend's set of basis gates. The failure looks quite different depending on whether the job runs on a real device or on a simulator:

  1. job.result() raises an exception for real devices, but not for simulators.
  2. The job's status is ERROR for real devices and DONE for simulators (for simulators, the information that an error has occurred is hidden in a status field inside the result).
  3. job.error_message() works only for devices; for simulators it raises an AttributeError.

I'd like to point out that this is a real and even annoying problem. I'm writing code for qiskit-experiments. Our experiments typically run on devices, but our tests and tutorials must run on the simulator (when applicable, i.e. measurement level 2 etc.), either directly or via a fake backend. Now:

  1. We don't want to fill our code with ifs or try-excepts to distinguish between job types.
  2. Even if we did, we would still have an issue with test coverage, because the tests wouldn't cover the branches that apply only to devices.

Specifically I have a concrete bug in qiskit-experiments right now (https://github.com/Qiskit/qiskit-experiments/issues/866) that I can test and debug only on a device. When I fix it, I'll want to add a test to our test suite, that will be able to catch the bug in the future if it returns. To overcome what I've just described, here is what I intend to do (I still don't know if it will work): for the sake of the test, I'll write a subclass of AerSimulator and a subclass of AerJob; I'll tweak their behavior to mimic devices.

Steps to reproduce the problem

Code:

from qiskit import QuantumCircuit, Aer, IBMQ

provider = IBMQ.load_account()
backends = [Aer.get_backend("aer_simulator_stabilizer"), provider.backend.ibmq_lima]

circ = QuantumCircuit(1, 1)
circ.t(0)
circ.measure(0, 0)

for backend in backends:
    print("\nRunning with", backend.name())
    job = backend.run(circ)

    try:
        print("Job result:", job.result())
    except Exception as e:
        print("Exception when trying to fetch job result:", e)
        print(f"     (exception type is {type(e).__name__})")

    print("Job status:", job.status())
    try:
        print("Job's error message:", job.error_message())
    except Exception as e:
        print("Exception when trying to fetch job's error message:", e)
        print(f"     (exception type is {type(e).__name__})")

Output:

Running with aer_simulator_stabilizer
Simulation failed and returned the following error message:
ERROR:  [Experiment 0] Circuit circuit-80 contains invalid instructions {"gates": {t}} for "stabilizer" method.
Job result: Result(backend_name='aer_simulator_stabilizer', backend_version='0.10.4', qobj_id='b51d3bc1-0f3d-44ff-a095-e61e0c8b2acf', job_id='cebe6869-9913-455d-a73f-dd2995f272fb', success=False, results=[ExperimentResult(shots=0, success=False, meas_level=2, data=ExperimentResultData(), status=ERROR: Circuit circuit-80 contains invalid instructions {"gates": {t}} for "stabilizer" method., seed_simulator=0, metadata={'noise': 'ideal', 'batched_shots_optimization': False, 'measure_sampling': False, 'device': 'CPU', 'num_qubits': 1, 'remapped_qubits': False, 'method': 'stabilizer', 'active_input_qubits': [0], 'num_clbits': 1, 'input_qubit_map': [[0, 0]]}, time_taken=0.0)], date=2022-08-04T17:35:12.808589, status=ERROR:  [Experiment 0] Circuit circuit-80 contains invalid instructions {"gates": {t}} for "stabilizer" method., header=QobjHeader(backend_name='aer_simulator_stabilizer', backend_version='0.10.4'), metadata={'time_taken': 0.000158406, 'time_taken_execute': 5.7428e-05, 'parallel_experiments': 1, 'omp_enabled': True, 'max_gpu_memory_mb': 0, 'num_mpi_processes': 1, 'time_taken_load_qobj': 9.4399e-05, 'max_memory_mb': 7068, 'mpi_rank': 0}, time_taken=0.00026607513427734375)
Job status: JobStatus.DONE
Exception when trying to fetch job's error message: 'AerJob' object has no attribute 'error_message'
     (exception type is AttributeError)

Running with ibmq_lima
Exception when trying to fetch job result: "Unable to retrieve result for job 62ebd9224f71a3657b0dadce. Job has failed: The Qobj uses gates (['t']) that are not among the basis gates (['id', 'rz', 'sx', 'x', 'cx', 'reset']). Error code: 1106."
     (exception type is IBMQJobFailureError)
Job status: JobStatus.ERROR
Job's error message: The Qobj uses gates (['t']) that are not among the basis gates (['id', 'rz', 'sx', 'x', 'cx', 'reset']). Error code: 1106.

(We also see the same behavior with the regular, non-stabilizer simulator and the ch gate, which is not among its basis gates.)

What is the expected behavior?

Identical behavior of jobs, regardless of their origin.

yaelbh avatar Aug 04 '22 15:08 yaelbh

Following up on the workaround I described above (a subclass of AerSimulator and a subclass of AerJob, tweaked to mimic devices), this is what the hack looks like:

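# Import paths assume qiskit-aer 0.10.x (the version reported above);
# newer releases expose the same classes from the qiskit_aer package.
from qiskit.exceptions import QiskitError
from qiskit.providers import JobStatus
from qiskit.providers.aer import AerSimulator
from qiskit.providers.aer.jobs import AerJob


class MyBackend(AerSimulator):
    def run(self, run_input, **options):
        job = super().run(run_input, **options)
        # Swap the class of the returned job so that it mimics a failed device job.
        job.__class__ = MyJob
        return job


class MyJob(AerJob):
    def result(self, timeout=None):
        # Device jobs raise when the job has failed.
        raise QiskitError

    def status(self):
        # Device jobs report ERROR rather than DONE.
        return JobStatus.ERROR

    def error_message(self):
        # AerJob has no error_message(); device jobs do.
        return "You're dealing with the wrong job, man"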

yaelbh avatar Aug 07 '22 13:08 yaelbh

I think Aer and IBMQ handle job status differently.

  • IBMQ: JobStatus.ERROR if experiments or the job scheduler fail
  • Aer: JobStatus.ERROR if the job scheduler (thread pool executor) fails

I proposed a new option ibmq_semantics=True in #1575 and would like to discuss whether it is reasonable.

hhorii avatar Aug 09 '22 02:08 hhorii

Additional cases where Aer's behavior differs from devices:

  • The configuration of AerSimulator states that n_qubits is equal to 28, yet Aer agrees to simulate circuits with 30 qubits.
  • When Aer is run with measurement_level set to 1, it ignores this option and runs as usual, producing counts; I'd expect an exception saying that this is not supported.
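
Until the backend enforces its own limit, a caller could guard against the first case explicitly. This is only a minimal sketch: it assumes the backend exposes configuration().n_qubits, as the bullet above describes, and that the circuit exposes num_qubits, as QuantumCircuit does. The name check_qubit_count and the stand-in objects are hypothetical.

```python
from types import SimpleNamespace


def check_qubit_count(circuit, backend):
    """Raise ValueError if the circuit is wider than the backend reports it supports."""
    limit = backend.configuration().n_qubits
    if circuit.num_qubits > limit:
        raise ValueError(
            f"Circuit uses {circuit.num_qubits} qubits, "
            f"but the backend reports n_qubits={limit}"
        )


# Hypothetical stand-ins so the sketch runs without Qiskit installed.
fake_backend = SimpleNamespace(configuration=lambda: SimpleNamespace(n_qubits=28))
fake_circuit = SimpleNamespace(num_qubits=30)

try:
    check_qubit_count(fake_circuit, fake_backend)
except ValueError as err:
    print("Rejected:", err)
```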

yaelbh avatar Aug 09 '22 08:08 yaelbh

Since #1575 was rejected, Aer cannot maintain two semantics. Also, because Aer needs to distinguish failures of an executor (including the thread pool and DASK executors) from failures of experiments, JobStatus.ERROR should report only executor errors, so Aer cannot use the same semantics as the IBMQ provider. I think similar differences in semantics will appear whenever a third-party provider is used, and only callers can fix them.
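
As a rough illustration of what a caller-side fix might look like, the sketch below concentrates the try/except in one helper. It assumes only what the output in this issue shows: device-style jobs raise from job.result(), while Aer returns a result whose success field is False and whose status field holds the error text. The name failure_message and the two stand-in job classes are hypothetical and not part of Qiskit.

```python
from types import SimpleNamespace


def failure_message(job):
    """Return None if the job succeeded, else a string describing the failure."""
    try:
        result = job.result()
    except Exception as exc:
        # Device-style jobs raise from result() when the job has failed.
        # Note: this broad catch also swallows unrelated errors such as timeouts.
        return str(exc)
    if result.success:
        return None
    # Simulator-style jobs return normally; the error text sits in result.status.
    return str(result.status)


# Hypothetical stand-ins for the two behaviors, so the sketch runs without a backend.
class FakeDeviceJob:
    def result(self):
        raise RuntimeError("Job has failed")


class FakeSimulatorJob:
    def result(self):
        return SimpleNamespace(success=False, status="ERROR: invalid instructions")


for job in (FakeDeviceJob(), FakeSimulatorJob()):
    print(type(job).__name__, "->", failure_message(job))
```

Test code can then assert on the return value of the helper instead of branching on the job type, although the helper itself still has to be exercised against both kinds of job.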

hhorii avatar Nov 01 '22 01:11 hhorii