qiskit-machine-learning
Implement QKMeans
Summary
Implementation of Quantum K Means as discussed in #232.
Details and comments
This is an initial implementation. Would appreciate your feedback to make any required changes.
Hi @prakharb10, why does the backend need to be an input argument? Also, didn't you run into PEP 8 issues with some of the variable names (q, qc, ...) or with the length of the get_states function?
Thanks for the PR. It really looks good.
But the algorithm itself has a crucial limitation that makes it less interesting to implement. I studied the cited paper, and on page 34 they state: "The measurement during swap test is causing the decoherence of input states, thus the quantum memory should contain multiple copies of input states". Since we don't have quantum memory, and nothing suggests it will be available soon, the quantum states have to be prepared again every time they are required. The algorithm uses amplitude encoding; take a look at formula (129) on page 34. Your implementation takes the same approach: on line 232 of qkmeans.py you have init_state = Initialize(x). This is exactly what the algorithm is supposed to do, so nothing is wrong here, but with a large number of features such initialization becomes a problem. The cost of preparing an arbitrary state grows exponentially with the number of qubits, so once the other terms in the algorithm's complexity are taken into account, the overall time complexity ends up the same as that of the classical clustering algorithm. Basically, quantum state preparation cancels out any benefit we might obtain from computing distances on a quantum computer. Given this limitation, I don't think the implementation can be accepted as a contribution, unfortunately.
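To make the re-preparation point concrete, here is a minimal classical sketch of the swap-test statistics. It is not the code from this PR: the function names (amplitude_encode, swap_test_p0, simulate_swap_test) are hypothetical, the vectors are assumed to be real-valued, and the simulation only samples the ancilla outcome using the standard relation P(0) = 1/2 + |<psi|phi>|^2 / 2. Each shot is assumed to consume one fresh copy of both input states, which is the cost discussed above.

```python
import random


def amplitude_encode(x):
    """Normalise a real feature vector and zero-pad it to a power-of-two length."""
    n = len(x)
    dim = max(2, 1 << (n - 1).bit_length())  # at least one qubit
    padded = [float(v) for v in x] + [0.0] * (dim - n)
    norm = sum(v * v for v in padded) ** 0.5
    if norm == 0:
        raise ValueError("cannot encode the zero vector")
    return [v / norm for v in padded]


def swap_test_p0(psi, phi):
    """Probability of reading 0 on the swap-test ancilla for real state vectors."""
    overlap = sum(a * b for a, b in zip(psi, phi))
    return 0.5 + 0.5 * overlap ** 2


def simulate_swap_test(x, y, shots=1024, seed=0):
    """Sample the ancilla `shots` times; also report how many states were prepared."""
    if len(x) != len(y):
        raise ValueError("both vectors must have the same number of features")
    p0 = swap_test_p0(amplitude_encode(x), amplitude_encode(y))
    rng = random.Random(seed)
    zeros = sum(rng.random() < p0 for _ in range(shots))
    # Assumption: measurement destroys the inputs, so every shot needs
    # a fresh copy of both states.
    preparations = 2 * shots
    return {"0": zeros, "1": shots - zeros}, preparations


counts, preparations = simulate_swap_test([1.0, 2.0, 3.0], [2.0, 1.0, 0.5])
print(counts, preparations)
```

With 1024 shots, a single distance estimate already needs 2048 state preparations in this model, and k-means repeats that for every point-centroid pair in every iteration.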
On the other hand, the implementation looks nice and I'd like to keep it as an example. The implementation can be a separate package. Also, there's an evolving process of setting up a community/ecosystem for the tools and software that depend on Qiskit. Once we have such a community, this implementation may become a member of it.
Let me know if you have questions. @stefan-woerner this is for your information.
Hi, just my two cents here. You don't need to calculate the exact distance between centroids and points; all you need is a metric that grows as the distance does. This can save a few operations (although it does not change the order of complexity). Here is an alternative version of the distance calculation function:
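    def DistCalc(backend: Backend, a: list, b: list, shots: int = 1024):
        # Prepare the two states and build the swap-test circuit
        psi, phi = get_states(a, b)
        qc = SwapTest(psi, phi)
        # Run the circuit and collect the measurement counts
        job = execute(qc, backend, shots=shots)
        result = job.result()
        counts = result.get_counts(qc)
        # Use the raw number of '0' outcomes as the distance proxy
        dist = counts['0']
        return dist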
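As a small illustration of why the raw count is enough for picking the nearest centroid: for a fixed number of shots, the squared-overlap estimate 2 * counts['0'] / shots - 1 is an increasing affine function of counts['0'], so both give the same ordering. The sketch below assumes that relation; the helper names and the count values are made up for illustration, and whether the overlap itself grows or shrinks with the distance depends on how get_states encodes the inputs, which is not shown here.

```python
def overlap_from_counts(zeros, shots):
    """Invert P(0) = 1/2 + |<psi|phi>|^2 / 2 to estimate the squared overlap."""
    return 2.0 * zeros / shots - 1.0


def rank_by(scores):
    """Indices of `scores` sorted from smallest to largest score."""
    return sorted(range(len(scores)), key=scores.__getitem__)


shots = 1024
# Hypothetical '0' counts for one point against three centroids.
zeros_per_centroid = [770, 540, 901]

raw_order = rank_by(zeros_per_centroid)
overlap_order = rank_by([overlap_from_counts(z, shots) for z in zeros_per_centroid])

# Both rankings agree, so the extra arithmetic can be skipped.
print(raw_order, overlap_order)
```

This only holds when every comparison uses the same number of shots; with different shot counts the raw values would need to be normalised first.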
The example is good, but it does not fit well into Qiskit Machine Learning. I'd appreciate it if you moved this PR/code to a dedicated repo and contributed it to the Qiskit Ecosystem; see https://qiskit.org/ecosystem/ for details. Let me know if you have questions.