tensorly
Randomised_CP function throws a Singular Matrix error
Steps or Code to Reproduce
import tensorly as tl
import ivy
tl.set_backend("numpy")
a = tl.ones((2,3,4))
z = tl.decomposition.randomised_parafac(a, 2, 1, n_iter_max=1, init="svd", max_stagnation=1, tol=0.03125, random_state=1)
print(z)
Error
Traceback (most recent call last):
File "/workspaces/ivy/test2.py", line 7, in <module>
z = tl.decomposition.randomised_parafac(a, 2, 1, n_iter_max=1, init="svd", max_stagnation=1, tol=0.03125, random_state=1)
File "/opt/miniconda/envs/multienv/lib/python3.10/site-packages/tensorly/decomposition/_cp.py", line 726, in randomised_parafac
factor = tl.transpose(tl.solve(pseudo_inverse, factor))
File "/opt/miniconda/envs/multienv/lib/python3.10/site-packages/tensorly/backend/__init__.py", line 206, in wrapped_backend_method
return getattr(
File "/opt/fw/mxnet/numpy/linalg/linalg.py", line 409, in solve
r = gufunc(a, b, signature=signature, extobj=extobj)
File "/opt/fw/mxnet/numpy/linalg/linalg.py", line 112, in _raise_linalgerror_singular
raise LinAlgError("Singular matrix")
numpy.linalg.LinAlgError: Singular matrix
The snippet runs successfully when rank == 1 but fails for all other values. Is there a bound on the value of rank for a given tensor?
Versions
numpy version: 1.25.2
Hi! Here I am offering you my findings on this problem: the parameter R you referred to is the number of components, i.e. rank-one tensors, you would like to have in the result of your CP decomposition. In the reference given for randomised_parafac() (Casey Battaglino, Grey Ballard and Tamara G. Kolda, "A Practical Randomized CP Tensor Decomposition"), section 3.1 discusses the smallest number of rank-one tensors in an exact CP decomposition, i.e. the tensor rank. Here I am citing a few relevant points:
- There is no straightforward algorithm to determine the rank of a specific tensor. In fact, the problem is NP-hard.
- In practice, the rank of a tensor is determined numerically by fitting various rank-R CP models. (From section 3.4:) Ideally, if the data is noise-free, and we have a procedure for calculating CP with a given number of components, then we can do that for R = 1, 2, 3, … and stop at the first value of R that gives a fit of 100%. However, there are many problems with this procedure. … When the data are noisy (as is frequently the case), then fit alone cannot determine the rank in any case; …
- ALS, today's workhorse method for computing CP at a given R, can take many iterations to converge, and is not guaranteed to converge to the global minimum or even to a stationary point. The authors also summarize typical ranks of three-way tensors over $\mathbb{R}$, with and without frontal slice symmetry.
(Btw, I checked the arXiv paper again today and could not find the tensor rank section any more. Not sure why but I am trying to find the version I read a few days ago.)
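To make the fit-based procedure quoted above concrete, here is a minimal sketch (my own illustration, not code from the paper or from tensorly's docs) that fits CP models of increasing rank with the non-randomized parafac and reports the fit. In floating point the fit rarely hits exactly 100%, so in practice one looks for the value of R where the fit plateaus:

import numpy as np
import tensorly as tl
from tensorly.decomposition import parafac

tl.set_backend("numpy")
rng = np.random.default_rng(0)

# Build a synthetic tensor of known CP rank 3, for illustration.
A, B, C = (rng.random((dim, 3)) for dim in (4, 5, 6))
tensor = tl.tensor(np.einsum("ir,jr,kr->ijk", A, B, C))

for R in range(1, 6):
    cp = parafac(tensor, rank=R)
    rec = tl.cp_to_tensor(cp)
    fit = 1 - tl.norm(tensor - rec) / tl.norm(tensor)
    print(f"R={R}: fit={float(fit):.4f}")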
Regarding the singular matrix error, specifically
factor = tl.transpose(tl.solve(pseudo_inverse, factor))
In numpy/linalg/linalg.py
def solve(a, b):
"""
Solve a linear matrix equation, or system of linear scalar equations.
Computes the "exact" solution, `x`, of the well-determined, i.e., full
rank, linear matrix equation `ax = b`.
…
"""
It seems that the singular matrix error occurs when a in ax = b is not invertible, which in this case means pseudo_inverse is not invertible. We can confirm this by inspecting the determinant of pseudo_inverse:
############## START
R = 2
############## ITERATE
======== iteration 0
------------- mode 0
det of pseudo_inverse: 4.791666666666669
------------- mode 1
det of pseudo_inverse: 4.397817528932697e-29
------------- mode 2
det of pseudo_inverse: 0.0
Traceback (most recent call last):
...
numpy.linalg.LinAlgError: Singular matrix
Inspecting the calculation steps:
711 kr_prod, indices_list = sample_khatri_rao(
712 factors, n_samples, skip_matrix=mode, random_state=rng
713 )
…
724 pseudo_inverse = tl.dot(tl.transpose(kr_prod), kr_prod)
There can be multiple causes for det(pseudo_inverse) = 0, analytical or numerical.
Analytically, kr_prod is the sampled Khatri-Rao product of size (n_samples, R), and pseudo_inverse = tl.dot(tl.transpose(kr_prod), kr_prod) is of size (R, R). For pseudo_inverse to be invertible, kr_prod must have full column rank, i.e. R non-zero singular values.
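To illustrate, here is a rough reconstruction of those steps (a sketch, not the library's exact code path: random factor matrices stand in for the real initialization, and a single uniformly drawn row stands in for sample_khatri_rao):

import numpy as np
import tensorly as tl
from tensorly.tenalg import khatri_rao

tl.set_backend("numpy")
rng = np.random.default_rng(1)

R = 2
# Factor matrices for the (2, 3, 4) example at R = 2.
factors = [tl.tensor(rng.random((dim, R))) for dim in (2, 3, 4)]

# Khatri-Rao product of all factors except mode 0, then one sampled row,
# roughly what sample_khatri_rao returns when n_samples = 1.
kr = khatri_rao(factors, skip_matrix=0)          # shape (3 * 4, R)
kr_prod = kr[rng.integers(0, kr.shape[0], 1)]    # shape (1, R)

pseudo_inverse = tl.dot(tl.transpose(kr_prod), kr_prod)  # (R, R), rank <= 1
print(np.linalg.det(pseudo_inverse))             # ~0, so tl.solve must fail

With n_samples = 1 the Gram matrix is the outer product of a single row with itself, so its rank is at most 1 and its determinant is 0 for any R > 1.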
Numerically, one possible reason is that column normalization of the updated factors is currently not implemented in randomised_parafac(). I'm not sure why this step is skipped.
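For comparison, the column-normalization step used in standard (non-randomized) CP-ALS looks roughly like the sketch below; normalize_columns is a hypothetical helper written for illustration, not a tensorly function:

import numpy as np

def normalize_columns(factor, eps=1e-12):
    # Per-column Euclidean norms of the factor matrix.
    weights = np.linalg.norm(factor, axis=0)
    # Guard against all-zero columns so we never divide by zero.
    safe = np.where(weights > eps, weights, 1.0)
    return factor / safe, weights

After each factor update the norms are absorbed into a separate weights vector, which keeps the factor entries well scaled and the Gram matrices better conditioned.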
chaoyihu is right about what the issue is here. I'll elaborate a bit further.
The issue is not with the choice of rank, but rather with the choice of the number of samples (n_samples), the third parameter of randomised_parafac. In the ALS algorithm for CP decomposition, each step involves solving a highly overdetermined linear least-squares problem, i.e., something of the form $\min_X \| AX - B \|$ where $A$ is tall and skinny. What randomised_parafac does is draw a subset of the rows of this linear system uniformly at random. The number of rows drawn is determined by n_samples. When this is set to 1, as in your example, only a single row is drawn. When $R = 1$, this doesn't break the code, since you end up with a $1 \times 1$ linear system which is (typically) invertible. However, for $R > 1$ you end up with an underdetermined linear system.
To see why this is a problem, suppose $S$ is the sampling operator. Then the sampled least-squares problem is $\min_X \| SAX - SB \|$. As implemented, randomised_parafac solves this via the normal equations, i.e., by reformulating it as the problem of solving $(SA)^\top (SA) X = (SA)^\top (SB)$ for $X$. As chaoyihu points out, this system is solved with numpy's solve function, which can only handle full-rank problems, i.e., when $(SA)^\top (SA)$ has full rank. If n_samples < rank, this matrix will always be rank-deficient and solve won't work.
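Here is a small self-contained numpy sketch of the sampled normal equations (the sizes are hypothetical): with n_samples well above $R$ the solve goes through, while a single sampled row reproduces the failure:

import numpy as np

rng = np.random.default_rng(0)
m, R = 100, 2
A = rng.random((m, R))                 # tall-and-skinny design matrix
B = rng.random((m, 5))

rows = rng.integers(0, m, size=10)     # uniform row sample, n_samples = 10
SA, SB = A[rows], B[rows]

# Normal equations for min ||SA X - SB||: (SA)^T (SA) X = (SA)^T (SB)
X = np.linalg.solve(SA.T @ SA, SA.T @ SB)    # works: Gram matrix has rank R

SA1, SB1 = A[[0]], B[[0]]              # n_samples = 1 < R
# np.linalg.solve(SA1.T @ SA1, SA1.T @ SB1)  # raises LinAlgError: Singular matrix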
In practice, since the sampling is done uniformly at random, there's a chance that the matrix $(SA)^\top (SA)$ is rank-deficient even if n_samples $\geq$ rank. It is therefore a good idea to choose n_samples to be quite a bit bigger than rank.
As a side note, randomised_parafac doesn't actually implement any of the methods by Battaglino et al. That paper uses randomized trigonometric transforms to first mix the rows before sampling. The way randomised_parafac is currently implemented, it's easy to construct an adversarial example which breaks the method (giving the kind of singular matrix error you noticed) with high likelihood, even when a large number (say, half) of all rows are included in the linear least-squares problems.
@OsmanMalik thanks for the comment! Could you elaborate on the issue with the randomized parafac? Would be awesome if you are able to add a fix! :)
@JeanKossaifi Suppose for example that the design matrix $A$ is tall-and-skinny with the following properties: $a_{11} = 1$, $a_{1j} =0$ for $j > 1$, $a_{i1} = 0$ for $i > 1$, the other entries are i.i.d. Gaussian (or anything that makes the submatrix $(a_{ij})_{i>1, j>1}$ full rank). Then, $A$ is full-rank. Any sampling operator that fails to draw the top row of $A$ will result in a rank-deficient sampled design matrix (first column of $SA$ will be zero), and hence in general the solution to the sampled linear system will be inaccurate. To ensure we include the top row with probability at least, say, 0.5 we need to draw at least half of all rows of $A$ when doing uniform sampling.
That's how the construction of the adversarial example goes. Of course, in practice things usually end up being a lot nicer than an adversarial example, so uniform sampling should work adequately and perform similarly to a sampler with theoretical guarantees, like those based on leverage score sampling or fast trigonometric transforms.
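A quick numpy sketch of that construction (with arbitrary sizes): any sample that misses the top row leaves the sampled design matrix rank-deficient, even when half of all rows are drawn:

import numpy as np

rng = np.random.default_rng(0)
m, R = 1000, 3

A = np.zeros((m, R))
A[0, 0] = 1.0                               # the lone nonzero of column 0
A[1:, 1:] = rng.standard_normal((m - 1, R - 1))
print(np.linalg.matrix_rank(A))             # R: A itself is full rank

rows = rng.integers(1, m, size=m // 2)      # a large sample that misses row 0
SA = A[rows]
print(np.linalg.matrix_rank(SA))            # R - 1: column 0 of SA is all zeros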
Would love to add a fix, but have limited time these days. Will take a stab if I find the time!