
unitary DFT for symmetric group algebra

Open jacksonwalters opened this issue 1 year ago • 5 comments

:memo: Checklist

  • [x] The title is concise and informative.
  • [x] The description explains in detail what this PR is about.
  • [x] I have linked a relevant issue or discussion.
  • [x] I have created tests covering the changes.
  • [x] I have updated the documentation and checked the documentation preview.

:hourglass: Dependencies

jacksonwalters avatar Jul 30 '24 20:07 jacksonwalters

https://github.com/sagemath/sage/issues/38456

jacksonwalters avatar Jul 30 '24 20:07 jacksonwalters

Documentation preview for this PR (built with commit bf00417fa3ad87c38ecb4f03abcacdbdd14d691a; changes) is ready! :tada: This preview will update shortly after each push to this PR.

github-actions[bot] avatar Jul 30 '24 22:07 github-actions[bot]

To handle extensions of $\mathbb{Q}$, we can work over a number field containing all necessary square roots. Currently there is some redundancy: I compute the diagonalizations to know which roots to take, then compute them again when computing the Fourier coefficients.
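
For example (a sketch with made-up square roots; the actual roots are the ones appearing in the diagonalizations):

x = polygen(QQ)
K = NumberField([x**2 - 2, x**2 - 3], 'a')   # tower Q(sqrt(2), sqrt(3))
Kabs = K.absolute_field('c')                 # a single absolute field containing both roots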

jacksonwalters avatar Jul 31 '24 18:07 jacksonwalters

Getting environment error: "pytest is not installed in the venv, skip checking tests that rely on it. Error: Process completed with exit code 1." Added ~3 lines to handle the case when the diagonal contains higher-degree algebraic numbers (n=5 and above).

jacksonwalters avatar Aug 08 '24 19:08 jacksonwalters

I checked that this builds if merged over the latest beta. Could you fix "This branch is out-of-date with the base branch" ? The interactive rebase feature as provided by GitHub is reasonably easy to understand and work with.

dimpase avatar Oct 21 '24 22:10 dimpase

That isn’t necessary until it is fully reviewed, and then only as a final check with the most recent changes.

tscrim avatar Oct 22 '24 00:10 tscrim

These rebases unbreak the CI - so that's the main reason to do it.

dimpase avatar Oct 22 '24 04:10 dimpase

It looks to me like the CI tests are passing (one skipped) so far.

jacksonwalters avatar Oct 23 '24 00:10 jacksonwalters

Work in issue 38456 is sufficient to construct this DFT. We compute $G$-invariant symmetric bilinear forms, factor the associated matrix $U=AA^\ast$, and obtain unitary representations $\tilde{\rho}(g)=A^\ast\rho(g)(A^{\ast})^{-1}$. The Fourier coefficients are $\hat{f}(\rho) = \sqrt{\frac{d_\rho}{|G|}}\sum_g f(g)\tilde{\rho}(g)$. However, $\mathrm{DFT}\cdot\mathrm{DFT}^\ast = S$ where $S$ is a diagonal matrix of signs ($+1$'s and $-1$'s). Factoring $S=RR^\ast$ with $R$ diagonal, $\mathrm{uDFT} = R^{-1}\cdot\mathrm{DFT}$ is unitary. Note that $c=zz^\ast$ always has a solution $z \in GF(q^2)$ when $c \in GF(q)$.
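
To illustrate the last point, here is a quick check (my sketch with q=7 for concreteness, not code from this PR) that $c=zz^\ast$ is solvable in $GF(q^2)$ via a discrete log:

q = 7
F = GF(q**2)
z = F.multiplicative_generator()
c = F(3)                     # an element of the base field GF(q)
k = c.log(z)                 # (q+1) divides k exactly because c^(q-1) = 1
sol = z**(k // (q + 1))
sol * sol**q == c            # True: sol * conj(sol) = c, since conj is x -> x^q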

jacksonwalters avatar Dec 05 '24 18:12 jacksonwalters

So the Forms package is a GAP package that has been included since GAP v4.9, but it is not included by our default installation nor in gap_packages.

However, I am thinking we should just reproduce their algorithm directly to take advantage of Sage's linear algebra packages. The only function really used is the BaseChangeToCanonical, and from looking at the source code, it seems like they are just running Gram-Schmidt by Gaussian elimination.

What do you think?

tscrim avatar Dec 06 '24 01:12 tscrim

So the Forms package is a GAP package that has been included since GAP v4.9, but it is not included by our default installation nor in gap_packages.

However, I am thinking we should just reproduce their algorithm directly to take advantage of Sage's linear algebra packages. The only function really used is the BaseChangeToCanonical, and from looking at the source code, it seems like they are just running Gram-Schmidt by Gaussian elimination.

What do you think?

That's unfortunate that it's not included by default. I don't know what the barriers are to getting it incorporated, or how long that would take. It would certainly make things easy.

I agree. We'd been talking about doing this anyways, and I'm happy to go ahead and just rewrite it in Sage. I'll try to get a basic version working tomorrow. Then there will probably be ways to optimize it.

jacksonwalters avatar Dec 06 '24 01:12 jacksonwalters

I believe it should just be getting the echelon form of the appropriately augmented Gram matrix of the initial sesquilinear form and then taking the appropriate submatrix (possibly transposed). So it should just be a few lines of code (at most).

tscrim avatar Dec 06 '24 03:12 tscrim

I believe it should just be getting the echelon form of the appropriately augmented Gram matrix of the initial sesquilinear form and then taking the appropriate submatrix (possibly transposed). So it should just be a few lines of code (at most).

I agree it should be a lot simpler than what is written in GAP. However, a straightforward augmentation and row reduction doesn't yield the correct matrix. For example, when q=11 and la=[2,1,1] with n=4, we have

q = 11
F = GF(q**2)
U=matrix(F,[[1,4,7],[4,1,4],[7,4,1]])
#use libgap for GAP evaluation of BaseChangeToCanonical from the `forms` package
def unitary_change_of_basis(U,q):
    if U.nrows() == 1 and U.ncols() == 1:
        return matrix(F,[[factor_scalar(U[0,0])]])
    loaded_forms = libgap.LoadPackage("forms")
    return matrix(F,libgap.BaseChangeToCanonical(libgap([list(row) for row in U]).HermitianFormByMatrix(F))).inverse()

unitary_change_of_basis(U,q)
[        1         0         0]
[        4  9*z2 + 2         0]
[        7 10*z2 + 1  3*z2 + 3]

which is correct. However,

def base_change_hermitian(U):
    F = U.base_ring()
    augmented_mat = U.augment(identity_matrix(F, U.nrows()))
    echelon_form = augmented_mat.echelon_form()
    return echelon_form[:, U.ncols():]

base_change_hermitian(U)
[7 2 9]
[2 7 2]
[9 2 7]

which is just $U^{-1}$. I'm not sure how to properly augment the matrix. Note also in the linked source code there is the block

if not IsOne(a) then
   # find an b element with norm b*b^t = b^(t+1) equal to a
   b := RootFFE(gf, a, t+1);
    Forms_MultRow(D,i,1/b);
fi; 

which suggests at some point we should be factoring scalars as $a=bb^\ast$.
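
In Sage, one way to get such a factorization (a sketch reusing F and q from the snippet above, not necessarily what this PR should use): the norm map $z \mapsto z^{q+1}$ from $GF(q^2)^\ast$ onto $GF(q)^\ast$ is surjective, so a $(q+1)$-th root always exists for $a \in GF(q)^\ast$.

a = F(5)                 # any nonzero element of the base field GF(q)
b = a.nth_root(q + 1)    # b satisfies b * b^q = b^(q+1) = a
b * b**q == a            # True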

jacksonwalters avatar Dec 06 '24 03:12 jacksonwalters

Okay, a little bit more care is needed here since it does the RREF. We only want to operate on the strictly lower (or upper) triangle of the matrix as the other part is taken care of by the symmetry. So a short version is to use the inverse of the upper part of the LU decomposition:

sage: U = matrix(F,[[1,4,7],[4,1,4],[7,4,1]])
sage: Up = U.LU()[2]
sage: D = Up.diagonal()
sage: A = ~Up * matrix.diagonal([d.sqrt() for d in D])
sage: A.H * U * A
[ 1  0  0]
[ 0 10  0]
[ 0  0 10]

Of course, this is not very efficient since it is computing far more things than necessary. However, it is easy to code...

tscrim avatar Dec 06 '24 05:12 tscrim

Mathematically it's not echelon form, and not even Gram-Schmidt, as you have a twist by the Galois automorphism in the factorisation.

dimpase avatar Dec 06 '24 05:12 dimpase

Mathematically it's not echelon form, and not even Gram-Schmidt, as you have a twist by the Galois automorphism in the factorisation.

For what it's worth, here is a direct translation of the GAP code into Sage which appears to be working now. Perhaps we can reduce it, but I agree with Dima that the twist makes it something slightly different.

#for u in GF(q), we can factor as u=aa^* using gen. z and modular arithmetic
def conj_square_root(u):
    if u == 0:
        return 0  # Special case for 0
    z = F.multiplicative_generator()
    k = u.log(z)  # Compute discrete log of u to the base z
    if k % (q+1) != 0:
        raise ValueError("Unable to factor: u is not in base field GF(q)")
    return z**((k//(q+1))%(q-1))

def base_change_matrix(mat):
    """
    Diagonalize a Hermitian matrix over a finite field.

    Arguments:
        mat: the Gram matrix of a Hermitian form over GF(q^2) (Sage matrix object)

    Returns:
        D: the base change matrix, so that D * mat * D^* is the canonical diagonal form
    """

    gf = mat.base_ring()
    n = mat.nrows()
    q = gf.order()
    t = sqrt(q)  # conjugation is x -> x^t; assumes gf = GF(t^2)

    A = copy(mat)
    D = identity_matrix(gf, n)
    row = 0

    # Diagonalize A
    while True:
        row += 1

        # Look for a non-zero element on the main diagonal, starting from `row`
        i = row - 1  # Adjust for zero-based indexing in Sage
        while i < n and A[i, i].is_zero():
            i += 1

        if i == row - 1:
            # Do nothing since A[row, row] != 0
            pass
        elif i < n:
            # Swap to ensure A[row, row] != 0
            A.swap_rows(row - 1, i)
            A.swap_columns(row - 1, i)
            D.swap_rows(row - 1, i)
        else:
            # All entries on the main diagonal are zero; look for an off-diagonal element
            i = row - 1
            while i < n - 1:
                k = i + 1
                while k < n and A[i, k].is_zero():
                    k += 1
                if k == n:
                    i += 1
                else:
                    break

            if i == n - 1:
                # All elements are zero; terminate
                row -= 1
                r = row
                break

            # Fetch the non-zero element and place it at A[row, row + 1]
            if i != row - 1:
                A.swap_rows(row - 1, i)
                A.swap_columns(row - 1, i)
                D.swap_rows(row - 1, i)

            A.swap_rows(row, k)
            A.swap_columns(row, k)
            D.swap_rows(row, k)

            b = A[row, row - 1]**(-1)
            A.add_multiple_of_row(row - 1, row, -b**t)
            A.add_multiple_of_column(row, row - 1, -b)
            D.add_multiple_of_row(row - 1, row, -b)

        # Eliminate below-diagonal entries in the current column
        a = -A[row - 1, row - 1]**(-1)
        for i in range(row, n):
            b = A[i, row - 1] * a
            if not b.is_zero():
                A.add_multiple_of_column(i,row - 1, b**t)
                A.add_multiple_of_row(i, row - 1, b)
                D.add_multiple_of_row(i, row - 1, b)

        if row == n - 1:
            break

    # Count how many variables have been used
    if row == n - 1:
        if not A[n - 1, n - 1].is_zero():
            r = n
        else:
            r = n - 1

    # Normalize diagonal elements to 1
    for i in range(r):
        a = A[i, i]
        if not a.is_one():
            # Find an element `b` such that `b*b^t = b^(t+1) = a`
            b = conj_square_root(a)
            D.rescale_row(i, 1 / b)

    return D
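
As a sanity check of the translation on the running example (my own sketch; the Frobenius conjugation $x \mapsto x^q$ is applied explicitly rather than via .H):

q = 11
F = GF(q**2)
U = matrix(F, [[1,4,7],[4,1,4],[7,4,1]])
D = base_change_matrix(U)
Dconj = D.apply_map(lambda x: x**q)                   # entrywise conjugation
D * U * Dconj.transpose() == identity_matrix(F, 3)    # should be True if the translation is correct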

jacksonwalters avatar Dec 06 '24 05:12 jacksonwalters

Mathematically it's not echelon form, and not even Gram-Schmidt, as you have a twist by the Galois automorphism in the factorisation.

I am pretty sure the first step of computing an orthogonal basis can be done by Gram-Schmidt. If it wasn’t, then it wouldn’t work over $\mathbb{C}$, where the dot product is sesquilinear. (Although one has to be a little more careful about which side of the inner product one puts the vectors on.) In particular, when you compute the orthogonality with the new vector at each step, you can put it on the side that does not have the automorphism.

Now for normalizing the diagonal entries we have to be more careful since we need to solve $zz^* = d$, but that also has a known solution.
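
For concreteness, here is a minimal sesquilinear Gram-Schmidt sketch over $F = GF(q^2)$ (my own sketch, not the PR's code; it assumes every pivot $B(w,w)$ is nonzero along the way, which is exactly the degenerate situation the GAP code's off-diagonal branch handles):

def gram_schmidt_sesquilinear(U):
    # rows of the returned matrix V form an orthogonal basis for B(x,y) = x*U*conj(y)^T
    F = U.base_ring()
    n = U.nrows()
    p, e = F.characteristic(), F.degree()
    t = p**(e // 2)                                  # conjugation is x -> x^t
    conj = lambda M: M.apply_map(lambda x: x**t)
    B = lambda x, y: (x * U * conj(y).transpose())[0, 0]
    rows = []
    for i in range(n):
        v = matrix(F, 1, n, {(0, i): 1})             # i-th standard basis vector as a row
        for w in rows:
            v = v - (B(v, w) / B(w, w)) * w          # assumes B(w, w) != 0
        rows.append(v)
    return matrix(F, [r.list() for r in rows])

On the running q=11 example this gives a matrix $V$ with $VUV^\ast$ diagonal, after which the only remaining step is the $zz^\ast = d$ normalization.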

tscrim avatar Dec 06 '24 06:12 tscrim

Here is a complete method for obtaining the decomposition $U=AA^\ast$. It does work, but I don't understand why this is different from the result of the other two methods. Do we know if this decomposition is unique? It seems to be in the 1-dimensional case.

def base_change_hermitian(U):
    Up = U.LU()[2]
    D = Up.diagonal()
    A = ~Up * matrix.diagonal([d.sqrt() for d in D])
    diag = (A.H * U * A).diagonal()
    factor_diag = diagonal_matrix([conj_square_root(d) for d in diag])
    return (factor_diag*A.inverse()).H

#use the upper part of the LU decomposition, and factor the diagonal
U=matrix(F,[[1,4,7],[4,1,4],[7,4,1]])
B = base_change_hermitian(U)
print(B*B.H == U)
print(B)
True
[   1    0    0]
[   4 8*z2    0]
[   7 4*z2 2*z2]

jacksonwalters avatar Dec 06 '24 07:12 jacksonwalters

If you do an $X = LDU$ decomposition of the matrix $X$, then $L^{-1}$ and $U^{-1}$ record the steps of the Gaussian elimination (assuming no permutation). Indeed $D = L^{-1} X U^{-1}$. Now if $X$ is Hermitian, then $L = U^{\ast}$. The matrices $L$ and $U$ are change of basis matrices for the codomain and domain of $X$ (which in this case is the same vector space $V$). Now the sesquilinear form given by $X$ under a change of basis $A$ of $V$ is $A^{\ast} X A$. Therefore, if we consider $A = U^{-1}$, then $A^* X A = L^{-1} X U^{-1} = D$. So here, we only want to run Gaussian elimination on $(X|I)$ until we get a row echelon form, yielding $(DU|L^{-1})$.

What is different than before is that computing echelon_form runs until it gets the rref, which uses both upper and lower triangular matrices and has $(X|I) \to (I|X^{-1})$. So we end up with $(X^{-1})^* X X^{-1} = (X^{-1})^*$ if we try to change the basis from the rref computation.

tscrim avatar Dec 06 '24 07:12 tscrim

@tscrim Any idea what’s going on with the two failing checks (macOS 3.11, Meson / Conda Python 3.11)? They were all passing before I squashed commits. Is there a way to just re-run the checks? The solutions here seem aggressive.

jacksonwalters avatar Dec 06 '24 18:12 jacksonwalters

I usually just ignore those…

tscrim avatar Dec 07 '24 12:12 tscrim

I don't mind ignoring them, but don't they need to pass to merge?

jacksonwalters avatar Dec 07 '24 16:12 jacksonwalters

They can be flaky. You can look at the logs and see that it isn’t anything related to your PR.

tscrim avatar Dec 07 '24 22:12 tscrim

Here's the error from the macos-latest, 3.11 log:

======================== 58 passed, 5 skipped in 59.55s ========================
Error: Process completed with exit code 1.

Here's the one for Meson / Conda, Python 3.11:

=================== 61 passed, 2 skipped in 67.85s (0:01:07) ===================
Error: Process completed with exit code 64.

I don't think it has anything to do with the PR code.


Separately, I do need to add an assertion requiring $p \nmid |G|$ since we are using the normalization factors $\sqrt{d_\rho/|G|}$. This avoids the situation where the space of $G$-invariant bilinear forms has dimension greater than one, e.g. q=2 and la=[3,1,1] for n=5.

This is not necessarily a bad thing; it's more a deferral to a time when we better understand this case, which I would call the intersection of the modular and unitary DFT. Probably we'll have to make use of two or more linearly independent forms, and I don't know how to do that (yet).


I'm also removing conjugate_pos_char. It's not necessary; one can just use .H.transpose().

jacksonwalters avatar Dec 09 '24 21:12 jacksonwalters

@tscrim Is it worth throwing a warning on line 2181 if len(U_mats) > 1, given that we are later choosing U_mats[0]? Again, this will only happen in the modular case we are avoiding, but still.

jacksonwalters avatar Dec 09 '24 21:12 jacksonwalters

You didn’t need to copy the log; there is no useful info really in it. (To my point, the last push reran the tests, which now pass.)

There is a conjugate() method for matrices IIRC. There is no point in transposing the matrix twice. :)

Rather than raise a warning, I would simply error out with a NotImplementedError since the current code doesn’t deal with that situation.

tscrim avatar Dec 09 '24 23:12 tscrim

Got rid of the log messages. Yes, looks like it was just flaky as you said. They are passing now.

Good to know! Not sure how I missed conjugate(). Updated.

I'm now raising a NotImplementedError for the modular case $p \mid |G|$. I'm also including the two key failure points in the error string:

  1. that the space of invariant forms can have dimension greater than one
  2. the normalization factor is not defined.

It's a bit extra, but for someone trying to extend this it points them to the right place (look at how the modular DFT is performed, and try to adapt it).
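
For reference, the guard could look roughly like this (a sketch only; the actual code and error message in the PR may differ):

q = SGA.base_ring().characteristic()
if q > 0 and q.divides(SGA.group().cardinality()):
    raise NotImplementedError("modular case p | |G|: the space of G-invariant forms "
                              "may have dimension greater than one and sqrt(d_rho/|G|) is not defined")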

jacksonwalters avatar Dec 10 '24 00:12 jacksonwalters

@tscrim Changes have been implemented. I was able to remove all occurrences of flatten, but I don't see how to get rid of transpose().

I went back and looked at the DFT in characteristic zero. I noticed that if U=SGA.dft(), then U*U.H = D, where D is diagonal. That was the original problem, to find a unitary DFT, which is solved in the first conditional.

However, it seems we're now comfortable just factoring this resultant diagonal and multiplying by the inverse of the factor to get a unitary DFT. In fact,

#an alternate method to create a unitary DFT in characteristic zero, avoiding other computations
SGA_dft = SGA.dft()
U_dft_sqrt_diag = diagonal_matrix([sqrt(d) for d in (SGA_dft*SGA_dft.transpose()).diagonal()]).inverse()*SGA_dft
U_dft_sqrt_diag*U_dft_sqrt_diag.H == identity_matrix(G.cardinality())
True

I'm a bit conflicted, because I feel it's illuminating to see how these number fields are actually constructed. Then again, if the resulting square roots are just those on the diagonal of $UU^\ast$, perhaps that's saying the same thing.

This turns out to work for the positive characteristic case as well, this time factoring $D=RR^\ast$, so each $d=rr^\ast$:

#an alternate way of computing the unitary DFT
dft_matrix = SGA.dft()
diag = (dft_matrix*dft_matrix.H).diagonal()
factor_diag_inv = diagonal_matrix([~conj_square_root(d) for d in diag])
U_dft = factor_diag_inv*dft_matrix
U_dft*U_dft.H == identity_matrix(len(G))
True

Do we want to replace the current code with these simple computations? Then perhaps we could move the existing code to compute unitary representations, rather than using them as an intermediary step for the DFT?

jacksonwalters avatar Dec 10 '24 07:12 jacksonwalters

I think the bulk of this code should be moved to unitary representations (e.g. as another option like orthogonal). Then we can compute unitary rep'ns independently, and use them here if desired.

I'm now in favor of just using the brief code which factors the resultant diagonal for the DFT, skipping the use of unitary rep'ns entirely.
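
If we go that route, the whole construction collapses to something like the following (a sketch combining the two snippets above, with conj_square_root as defined earlier and assuming the non-modular case):

def unitary_dft(SGA):
    dft = SGA.dft()
    diag = (dft * dft.H).diagonal()
    if SGA.base_ring().characteristic() == 0:
        R = diagonal_matrix([sqrt(d) for d in diag])              # may require extending the base field
    else:
        R = diagonal_matrix([conj_square_root(d) for d in diag])  # factor each d = r*r^* over GF(q^2)
    return R.inverse() * dft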

jacksonwalters avatar Dec 10 '24 18:12 jacksonwalters