matlib icon indicating copy to clipboard operation
matlib copied to clipboard

Argument to have GramSchmidt return matrix having same dimension as input

Open ggrothendieck opened this issue 2 years ago • 15 comments

Currently GramSchmidt contains the line below which removes the zero columns

B <- B[, !apply(B, 2, function(b) all(b == 0))]

however, that makes it hard to relate the result to the input matrix. Suggest that there be an argument to skip over that line so that the output columns directly correspond to the input columns in the same position.

ggrothendieck avatar Mar 06 '23 13:03 ggrothendieck

I don't remember writing this code but I wouldn't be surprised if I did. I agree that it would be desirable to optionally retrain the 0 columns, and arguably to make that the default, although it do so might break existing code and so might not be a good idea.

AFAICS, the documentation for the function doesn't mention removing linearly dependent columns and indeed implies the contrary. Also, silently removing the columns is probably not desirable; printing a warning could break existing code but a message is probably in order.

I can make these changes if people agree.

Thoughts?

John

On 2023-03-06 8:12 a.m., ggrothendieck wrote:

Currently |GramSchmidt| contains the line below which removes the redundant columns

|B <- B[, !apply(B, 2, function(b) all(b == 0))] |

however, that makes it hard to relate the result to the input matrix. Suggest that there be an argument to skip over that line so that the output columns directly correspond to the input columns in the same position.

— Reply to this email directly, view it on GitHub https://github.com/friendly/matlib/issues/48, or unsubscribe https://github.com/notifications/unsubscribe-auth/ADLSAQRDH62CHRKGFCWTBVLW2XPFNANCNFSM6AAAAAAVRDLSTM. You are receiving this because you are subscribed to this thread.Message ID: @.***>

john-d-fox avatar Mar 07 '23 22:03 john-d-fox

I agree that optionally allowing to retain the 0 columns is desireable, and I think the suggestion to make the quoted line conditional would do that.

An argument could be made either way for what should be the default, but at least it would be documented.

Perhaps review the examples / vignettes to see how GramSchmidt() is currently used in the package?

friendly avatar Mar 07 '23 22:03 friendly

On 2023-03-07 5:48 p.m., Michael Friendly wrote:

I agree that optionally allowing to retain the 0 columns is desireable, and I think the suggestion to make the quoted line conditional would do that.

As it turns out, not quite, because it's necessary to account for retained 0 columns if the matrix is normalized. I've done that and also added a message when 0 columns are suppressed.

An argument could be made either way for what should be the default, but at least it would be documented.

Perhaps review the examples / vignettes to see how GramSchmidt() is currently used in the package?

Please feel free to change the default if you think that's best.

My changes are in the issue42 branch.

John

— Reply to this email directly, view it on GitHub https://github.com/friendly/matlib/issues/48#issuecomment-1458984764, or unsubscribe https://github.com/notifications/unsubscribe-auth/ADLSAQUHA6OJ42HCE4SFJ3DW263MZANCNFSM6AAAAAAVRDLSTM. You are receiving this because you commented.Message ID: @.***>

john-d-fox avatar Mar 08 '23 01:03 john-d-fox

Accepted John's changes, now in master. I've bumped the pkg version to 0.8-7, but will wait to release this.

@ggrothendieck You can (after this settles down) install the github version to test this.

friendly avatar Mar 08 '23 22:03 friendly

Seems to work when I tried it.

Since the output is now the same shape as the input one could retain the column names to make it easier to relate the output to the input -- actually it could be done even if zero columns were omitted.

transform(BOD, z = Time + demand, zz = rnorm(6)) |> 
  as.matrix() |>
  GramSchmidt(omit_zero_columns = FALSE)
##            [,1]        [,2] [,3]        [,4]
## [1,] 0.09805807  0.28210212    0 -0.02308851
## [2,] 0.19611614  0.24258721    0  0.59730121
## [3,] 0.29417420  0.54510946    0  0.39265953
## [4,] 0.39223227  0.25034293    0 -0.61734941
## [5,] 0.49029034 -0.70807780    0  0.27453213
## [6,] 0.68640647  0.01948748    0 -0.17896507

ggrothendieck avatar Mar 15 '23 12:03 ggrothendieck

Preserving the column names makes sense to me (and is easily done). Shall I make this change?

john-d-fox avatar Mar 15 '23 16:03 john-d-fox

@ggrothendieck @john-d-fox This change seems sensible, but I'd like to see an example of what omit_zero_columns = TRUE should/does look like. Does this change still work if the matrix has row/col names?

friendly avatar Mar 15 '23 16:03 friendly

The names of the nonzero columns are retained. Yes, the change works if there are row/column names. Row names always worked, and retaining column names is the point.

Shall I commit the change? If you don’t like it, you can always revert it (though I doubt you’ll want to).

john-d-fox avatar Mar 15 '23 16:03 john-d-fox

The example shown in my earlier post using BOD had input column names but no column names appeared on the output.

ggrothendieck avatar Mar 15 '23 22:03 ggrothendieck

Right. I'm waiting to hear from Michael before committing the fix, which displays the column names if they're defined.

john-d-fox avatar Mar 15 '23 22:03 john-d-fox

FWIW, here's the code:

GramSchmidt <- function (X, normalize = TRUE, verbose = FALSE,
                         tol=sqrt(.Machine$double.eps), omit_zero_columns=TRUE) {
  B <- X
  B[, 2L:ncol(B)] <- 0
  if (verbose) {
    cat("\nInitial matrix:\n")
    printMatrix(X)
    cat("\nColumn 1: z1 = x1\n")
    printMatrix(B)
  }
  for (i in 2:ncol(X)) {
    res <- X[, i]
    for (j in 1:(i - 1)) res <- res - Proj(X[, i], B[, j])
    if (len(res) < tol) res <- 0
    B[, i] <- res
    if (verbose) {
      prj_char <- character(i - 1)
      for (j in 1:(i - 1)) prj_char[j] <- sprintf("Proj(x%i, z%i)",
                                                  i, j)
      cat(sprintf("\nColumn %i: z%i = x%i - %s\n", i, i,
                  i, paste0(prj_char, collapse = " - ")))
      printMatrix(B)
    }
  }
  zeros <- apply(B, 2, function(b) all(b == 0))
  if (omit_zero_columns && any(zeros)){
    B <- B[, !zeros]
    which_zeros <- (1:ncol(X))[zeros]
    message("linearly dependent column", if (length(which_zeros) > 1) "s" else "",
            " omitted: ", paste(which_zeros, collapse=", "))
  }
  if (normalize) {
    norm <- diag(1/len(B))
    colnames <- colnames(B)
    B <- B %*% norm
    colnames(B) <- colnames
    if (!omit_zero_columns && any(zeros)) B[, zeros] <- 0
    if (verbose) {
      cat("\nNormalized matrix: Z * inv(L) \n")
      printMatrix(B)
    }
  }
  if (verbose)
    return(invisible(B))
  B
}

john-d-fox avatar Mar 15 '23 22:03 john-d-fox

Looks good. Go ahead and commit to master.

friendly avatar Mar 16 '23 02:03 friendly

That's apparently easier said than done. When I tried to push the change, I got

>>> /usr/bin/git push origin HEAD:refs/heads/master
remote: error: GH006: Protected branch update failed for refs/heads/master.        
remote: error: At least 1 approving review is required by reviewers with write access.        
To https://github.com/friendly/matlib.git
 ! [remote rejected] HEAD -> master (protected branch hook declined)
error: failed to push some refs to 'https://github.com/friendly/matlib.git'

I suggest that you take the code from my previous "comment" to update GramSchmidt() in the master branch since I apparently am not allowed to do it.

john-d-fox avatar Mar 16 '23 15:03 john-d-fox

Ugh! This is some new github feature regarding permission I don't quite understand. Evidently, new additions to master must be via a pull request from another branch, or a pull request to the master branch that someone else (me) has to approve. In any case, it has to be via a pull request.

I don't think I can just copy/paste your code in the comment into the GramSchmidt.R.

I created a new branch, devel-0.9-7. Can you do a pull from RStudio git, switch to the devel-0.9-7 branch, and commit your update there? and then push to that branch... Then, either you or I can do the pull request thing. I'm sorry that this has become an extra burden.

>>> C:/Program Files/Git/bin/git.exe checkout -B devel-0.9.7
Switched to a new branch 'devel-0.9.7'
>>> C:/Program Files/Git/bin/git.exe push -u origin devel-0.9.7
remote: 
remote: Create a pull request for 'devel-0.9.7' on GitHub by visiting:        
remote:      https://github.com/friendly/matlib/pull/new/devel-0.9.7        
remote: 
branch 'devel-0.9.7' set up to track 'origin/devel-0.9.7'.
To github.com:friendly/matlib.git
 * [new branch]      devel-0.9.7 -> devel-0.9.7

friendly avatar Mar 16 '23 17:03 friendly

OK, I pushed the change to GramSchmidt() to the devel-0.9-7. Can you take it from here?

john-d-fox avatar Mar 16 '23 17:03 john-d-fox