qrand icon indicating copy to clipboard operation
qrand copied to clipboard

Stable bounded factorization algorithm

Open pedrorrivero opened this issue 3 years ago • 3 comments

Is your feature request related to a problem? Please describe.

QiskitPlatform splits job repetitions into shots and experiments. Therefore, in order to build a QiskitJob, we need to factor any user input number of repetitions into a number of shots and a number of experiments such that shots * experiments = repetitions. Also, we need to take into account the fact that these two amounts are bounded by max_shots and max_experiments respectively.

Because of this last restriction, said task can be impossible to perform: for instance, if repetitions is a prime number larger than both bounds. To address this issue, we would like to find the factorization which is closest to the actual number of repetitions requested without exceeding it.

Describe the solution you'd like

Representing the solution algorithm as a function, this problem can be mathematically modeled as follows:

Find f: ℕ³ → D ⊆ ℕ² such that f(n, A, B) = (a, b) minimizes n-a•b, where:

  1. a•b ≤ n < A•B
  2. a ≤ A < n and b ≤ B < n
  3. f(a•b, A, B) = (a', b') : a•b = a'•b'

Notice how we require A < n otherwise the trivial solution (a=n, b=1) would be available. The same applies to B < n. Additionally, we require n < A•B to scape the trivial solution (a=A, b=B).

The last (stability) condition is needed to guarantee performance (i.e. running the algorithm iteratively does not worsen the approximation). Notice that this condition is automatically satisfied if the function always returns an optimal solution. However, since (a, b) ∈ D —as opposed to being general natural numbers— this condition is, in principle, less restrictive than asking for perfect factoring of any given composite number.

By convention, we assume zero to not be contained in the natural numbers (ℕ).

Describe alternatives you've considered

Due to symmetry considerations, we can assume the first bound to be lower than the second without loss of generality. Otherwise, we would only need to swap them.

A naive approach to solving this problem would be as follows:

def g(
    n: int, bound_A: int, bound_B: int
) -> Tuple[int, int]:
    if not bound_A * bound_B > n:
        return bound_A, bound_B
    swapped: bool = bound_A > bound_B
    bound_A, bound_B = sorted([bound_A, bound_B])
    a = min(bound_A, n // bound_B + 1)
    b = min(bound_B, n // a)
    return (b, a) if swapped else (a, b)

However, this violates (3): as displayed for n=11, A=4, B=5 which returns (a=3, b=3). Updating n=3•3 it follows (a'=2, b'=4), so we have 11 > 3•3 > 2•4. Nonetheless, this process seems to converge, so a workaround would be to repeat the process until the output stabilizes.

def f(
    n: int, bound_A: int, bound_B: int
) -> Tuple[int, int]:
    last_a, last_b = g(n, bound_A, bound_B)
    a, b = g(last_a * last_b, bound_A, bound_B)
    while a * b != last_a * last_b:
        last_a, last_b = a, b
        a, b = g(n, bound_A, bound_B)
    return a, b

In order to implement this approach we would need:

  1. Formal proof of fast convergence
  2. Show that the result minimizes n-a•b in some acceptable way, even if suboptimal*

The algorithm currently used finds the optimal solution through exhaustive search:

def compute_bounded_factorization(
    n: int, bound_A: int, bound_B: int
) -> Tuple[int, int]:
    if bound_A * bound_B < n:
        return bound_A, bound_B
    swapped: bool = bound_A > bound_B
    bound_A, bound_B = sorted([bound_A, bound_B])
    final_delta: int = n
    b: int = bound_B
    a: int = n // b
    delta: int = n - a * b
    while a <= bound_A and a <= b and final_delta != 0:
        if delta < final_delta:
            final_a, final_b, final_delta = a, b, delta
        a += 1
        b = n // a
        delta = n - a * b
    return (final_b, final_a) if swapped else (final_a, final_b)

Additional context

Due to the nature of the qiskit backends experiments are more expensive than shots, so we would ideally want to keep the number of experiments as low as possible if several equally good, or even optimal, solutions exist.

Although we do not provide a formal proof, we believe this problem to be beyond NP. This is so because it reduces to the factoring problem for A = B = n-1, when n is the product of only two primes. Additionally, verifying an answer does not seem to be efficient either, since we do not know a way of finding the optimal gap n-a•b without going through the same process required for solving the problem to begin with.

*Therefore, we are open to considering suboptimal solutions that are capable of improving performance (i.e. speed) significantly.

pedrorrivero avatar Apr 08 '21 00:04 pedrorrivero