``einsum`` vs ``reduce`` for several ``qibo.hamiltonians`` related cases
As discussed in #1548, there are various occasions where an `einsum` may perform better than the current `reduce` approach on some backends (most likely highly parallelized ones, such as GPU backends). I report here the alternative implementations that should be checked.
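For context, here is a minimal side-by-side sketch of the two strategies, using plain `numpy` in place of a qibo backend (illustrative only, not the actual implementation):

```python
from functools import reduce

import numpy as np

matrix_list = [np.eye(2), np.array([[0.0, 1.0], [1.0, 0.0]]), np.eye(2)]
n = len(matrix_list)

# current strategy: fold the list with sequential Kronecker products
h_reduce = reduce(np.kron, matrix_list)

# alternative strategy: a single einsum over all factors, then one reshape;
# matrix i carries the (row, column) index pair (2 * i, 2 * i + 1)
lhs = [[2 * i, 2 * i + 1] for i in range(n)]
args = [item for pair in zip(matrix_list, lhs) for item in pair]
rhs = [2 * i for i in range(n)] + [2 * i + 1 for i in range(n)]
h_einsum = np.einsum(*args, rhs).reshape(2**n, 2**n)

assert np.allclose(h_reduce, h_einsum)
```

The `reduce` version materializes every intermediate Kronecker product, while the `einsum` version leaves the contraction strategy to the backend, which is where highly parallelized backends could gain.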
### multikron
https://github.com/qiboteam/qibo/blob/09cba741e391c07910daad0faccf9d5e212b61c2/src/qibo/hamiltonians/models.py#L345
```python
def _multikron(matrix_list, backend):
    # pair each matrix with a (row, column) index pair: (0, 1), (2, 3), ...
    indices = list(range(2 * len(matrix_list)))
    even, odd = indices[::2], indices[1::2]
    lhs = zip(even, odd)
    # the output carries all row indices first, then all column indices,
    # so the reshape below recovers the Kronecker-product ordering
    rhs = even + odd
    einsum_args = [item for pair in zip(matrix_list, lhs) for item in pair]
    dim = 2 ** len(matrix_list)
    h = backend.np.einsum(*einsum_args, rhs)
    h = backend.np.sum(backend.np.reshape(h, (-1, dim, dim)), axis=0)
    return h
```
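A quick consistency check against the `reduce`-based result, using a minimal stub in place of a full qibo backend (only the `backend.np` attribute is assumed to exist):

```python
from functools import reduce
from types import SimpleNamespace

import numpy as np

backend = SimpleNamespace(np=np)  # stub: `_multikron` only touches `backend.np`

rng = np.random.default_rng(0)
matrices = [rng.standard_normal((2, 2)) for _ in range(4)]
assert np.allclose(_multikron(matrices, backend), reduce(np.kron, matrices))
```

Note that the leading axis of the `(-1, dim, dim)` reshape always has size one here, so the final reshape-and-sum should be equivalent to a plain reshape to `(dim, dim)`.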
### build spin model
https://github.com/qiboteam/qibo/blob/09cba741e391c07910daad0faccf9d5e212b61c2/src/qibo/hamiltonians/models.py#L357
```python
def _build_spin_model(nqubits, matrix, condition, backend):
    indices = list(range(2 * nqubits))
    even, odd = indices[::2], indices[1::2]
    # every column shares the batch (term) index `2 * nqubits` on its first axis
    lhs = zip(nqubits * [len(indices)], even, odd)
    rhs = [len(indices)] + even + odd
    # column j stacks, for each term i, either `matrix` or the identity
    columns = [
        backend.np.reshape(
            backend.np.concatenate(
                [
                    matrix if condition(i, j) else backend.matrices.I()
                    for i in range(nqubits)
                ],
                axis=0,
            ),
            (nqubits, 2, 2),
        )
        for j in range(nqubits)
    ]
    einsum_args = [item for pair in zip(columns, lhs) for item in pair]
    dim = 2**nqubits
    # one batched Kronecker product per term, then the sum over terms
    h = backend.np.einsum(*einsum_args, rhs, optimize=True)
    h = backend.np.sum(backend.np.reshape(h, (nqubits, dim, dim)), axis=0)
    return h
```
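As a usage sketch, the same stub-backend trick can exercise this function, e.g. for the single-qubit term $\sum_i Z_i$; the `condition(i, j)` flag decides whether term $i$ places `matrix` on position $j$ (everything below is illustrative):

```python
from functools import reduce
from types import SimpleNamespace

import numpy as np

backend = SimpleNamespace(
    np=np,
    matrices=SimpleNamespace(I=lambda: np.eye(2)),  # stub for backend.matrices.I()
)

Z = np.diag([1.0, -1.0])
nqubits = 3

# sum_i Z_i: term i places Z on qubit i and identities everywhere else
h = _build_spin_model(nqubits, Z, lambda i, j: i == j, backend)

expected = sum(
    reduce(np.kron, [Z if j == i else np.eye(2) for j in range(nqubits)])
    for i in range(nqubits)
)
assert np.allclose(h, expected)
```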
### SymbolicTerm.matrix
https://github.com/qiboteam/qibo/blob/09cba741e391c07910daad0faccf9d5e212b61c2/src/qibo/hamiltonians/terms.py#L203
```python
def matrix(self):
    # find the max number of matrices for each qubit
    max_len = max(len(v) for v in self.matrix_map.values())
    nqubits = len(self.matrix_map)
    # pad each list with identity to max_len
    matrix = []
    for qubit, matrices in self.matrix_map.items():
        matrix.append(
            self.backend.np.concatenate(
                self.matrix_map[qubit]
                + (max_len - len(matrices)) * [self.backend.np.eye(2)],
                axis=0,
            )
        )
    # separate in `max_len`-column tensors of shape (`nqubits`, 2, 2)
    matrix = self.backend.np.transpose(
        self.backend.np.reshape(
            self.backend.np.concatenate(matrix, axis=0), (nqubits, max_len, 2, 2)
        ),
        (1, 0, 2, 3),
    )
    # chain-multiply the `max_len` matrices of each qubit in a single einsum:
    # tensor k is labeled (0, k + 1, k + 2), so consecutive tensors share an index
    indices = list(zip(max_len * [0], range(1, max_len + 1), range(2, max_len + 2)))
    lhs = zip(matrix, indices)
    lhs = [el for item in lhs for el in item]
    matrix = self.backend.np.einsum(*lhs, (0, 1, max_len + 1))
    return self.coefficient * _multikron(matrix, self.backend)
```
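The key step is the batched chain multiplication: after the transpose, the `max_len` stacks of shape `(nqubits, 2, 2)` share the qubit axis (label `0`), while consecutive stacks share a contracted matrix index, so a single `einsum` multiplies each qubit's chain at once. An isolated numpy sketch of just that contraction (names made up for illustration):

```python
from functools import reduce

import numpy as np

nqubits, max_len = 3, 4
rng = np.random.default_rng(1)
# stacks[k][q] is the k-th matrix acting on qubit q
stacks = [rng.standard_normal((nqubits, 2, 2)) for _ in range(max_len)]

# stack k is labeled [0, k + 1, k + 2]: axis 0 is the shared qubit index,
# and consecutive stacks share a contracted matrix index
indices = [[0, k + 1, k + 2] for k in range(max_len)]
args = [el for pair in zip(stacks, indices) for el in pair]
chained = np.einsum(*args, [0, 1, max_len + 1])  # shape (nqubits, 2, 2)

# equivalent to chain-multiplying each qubit's matrices one by one
expected = np.stack(
    [reduce(np.matmul, [s[q] for s in stacks]) for q in range(nqubits)]
)
assert np.allclose(chained, expected)
```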
### multikron (update)
Just to link the outcome of an investigation: https://github.com/qiboteam/qibo/pull/1548#issuecomment-2648375760.
In this specific case, an `einsum` is perhaps neither cleaner nor necessarily faster.
However, this happens specifically because the operation in the `for` loop grows exponentially in complexity (including memory), to the point that the last iteration (or, at most, the last few) dominates the whole process. So, exploring `einsum` (and its optimization) may still be perfectly worthwhile for all or most of the other cases.
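As a back-of-the-envelope cost model (my own estimate, assuming dense $2 \times 2$ factors): at step $k$ the partial product is a $2^k \times 2^k$ matrix, so the Kronecker product at that step touches $O(4^{k+1})$ entries, and over $n$ factors

$$\sum_{k=1}^{n} 4^{k} = \frac{4^{n+1} - 4}{3} \approx \frac{4}{3}\, 4^{n},$$

i.e. the final `kron` alone accounts for roughly $3/4$ of the total work (and a comparable share of the peak memory). That is why speeding up the earlier iterations barely moves the total for this specific routine, while it can still pay off elsewhere.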