Add N-d sum to zero

Open spinkney opened this issue 7 months ago • 0 comments

I'm preparing the docs for the sum-to-zero matrix and it's essentially a nested version of the 1-d sum-to-zero constraint. I thought that there must be an easier way to generalize everything to higher order dimensions. In fact there is and it's even better than I originally thought. We can re-use the 1-d constraint for everything.

I did write this up in cpp that you can see in Godbolt (here) but I find the R code to be easier to understand.

N-d zero-sum background

Our 1-d sum-to-zero is

zero_sum_helmert <- function(y) {
    N <- length(y)
    z <- rep(0, N + 1)
    sum_w <- 0

    for (ii in 1:N) {
        i <- N - ii + 1
        n <- i
        w <- y[i] / sqrt(n * (n + 1))

        sum_w <- sum_w + w
        z[i] <- z[i] + sum_w
        z[i + 1] <- z[i + 1] - w * n
    }
    z
}

To do an N-d the code is

# Row-major strides & flat-index helpers
make_strides <- function(dims) {
    D <- length(dims)
    stride <- integer(D)
    stride[D] <- 1
    for (d in (D - 1):1) stride[d] <- stride[d + 1] * dims[d + 1]
    stride
}

flat_index <- function(idx0, stride) {
    ## idx0 is *zero-based*; add 1 for R’s one-based vectors
    sum(idx0 * stride) + 1L
}

# General N-D zero-sum expansion
zero_sum_nd <- function(x, type = c("helmert", "householder"), dims = NULL) {
    Z <- x
    cur <- dims # current extents
    type <- match.arg(type)

    for (ax in seq_along(dims)) {
        next_dims <- cur
        next_dims[ax] <- next_dims[ax] + 1L
        Z_next <- numeric(prod(next_dims))

        stride_in <- make_strides(cur)
        stride_out <- make_strides(next_dims)

        ## loop over every index-tuple except the current axis
        lead_total <- prod(cur[-ax])
        lead_idx <- integer(length(cur))

        for (flat in 0:(lead_total - 1L)) {
            # decode flat -> lead_idx
            rem <- flat
            for (d in rev(setdiff(seq_along(cur), ax))) {
                lead_idx[d] <- rem %% cur[d]
                rem <- rem %/% cur[d]
            }

            # pull the 1-D slice
            slice <- numeric(cur[ax])
            for (k in 0:(cur[ax] - 1L)) {
                lead_idx[ax] <- k
                slice[k + 1L] <- Z[flat_index(lead_idx, stride_in)]
            }
           
            # any orthgonal type zero-sum tranform works here!
            z <- zero_sum_helmert(slice)

            # write back
            for (k in 0:(cur[ax])) {
                lead_idx[ax] <- k
                Z_next[flat_index(lead_idx, stride_out)] <- z[k + 1L]
            }
        }
        Z <- Z_next
        cur <- next_dims
    }
    Z
}

This works because the constraint is a projection onto the 1-dimensional span of a row vector of 1s, is square, and orthogonal. Applying the constraint sequentially over each tensor slice (often called a fiber) works because of the orthonormal property of the transform we are using.

We can actually confirm that this is the result of the orthogonal transform by running basis vectors over each dimension through the transform and then seeing if the resultant basis matrix is orthogonal.

make_basis <- function(input_dims, fun, type = NULL) {
    dim_prod <- prod(input_dims)
    out_size <- prod(input_dims + 1)
    B <- matrix(0, out_size, dim_prod) # each column will be one basis vector
    type <- match.arg(type, choices = c("helmert", "householder"))

    if (as.character(substitute(fun)) == "zero_sum_nd") {
        dims <- input_dims
    } else {
        dims <- NULL
    }

    for (j in 1:dim_prod) { # feed unit vectors through the loop
        e <- numeric(dim_prod)
        e[j] <- 1
        X <- array(e, dim = input_dims)
        Z <- fun(X, type, dims)
        B[, j] <- as.vector(Z)
    }
    B
}

B_tensor <- make_basis(c(3, 1), fun = zero_sum_nd, type = "helmert")
# this should be identity if orthogonal basis
zapsmall(crossprod(B_tensor))

Proposal

Since we do not have any N-d array constraint types I think it would be easiest to initially include this as part of the constrain transform functions (forthcoming). The user would specify an N-d array where the size of each dimension is M - 1 (where M itself is an N-d array) in the parameters block and feed this to the constraint transform which returns an N-d array of length M.

parameters {
  array[4, 5, 6, 7] real y;
}
transformed parameters {
 array[5, 6, 7, 8] real z = zero_sum_array_constraint(y);
}

It's annoying to type out the array dimensions but I'm not sure if we can feed an array in for the dimension sizes or have some syntactic sugar like

array[sizes = my_array_sizes] real y;

May 12 '25 15:05 spinkney