
AugLagrangian with constraints validation

Open · jasonjfoster opened this issue 2 years ago · 6 comments

Issue description

Hi, I'm using ens::AugLagrangian to minimize variance with a constraint (weights sum to one) and am struggling to validate the results against another optimizer. The constraint appears to be satisfied, but the ensmallen result remains greater than the other optimizer's result. I followed another example (see https://github.com/mlpack/ensmallen/issues/199) but must still be missing something and would appreciate the help.

Your environment

  • version of ensmallen: 2.19.0
  • operating system: Windows
  • compiler: g++
  • version of Armadillo: 11.2.2
  • any other environment information you think is relevant:

Steps to reproduce

#include <ensmallen.hpp>

class MinVarFunction {
  
public:
  
  MinVarFunction(const arma::mat& sigma)
    : sigma(sigma) { }
  
  double Evaluate(const arma::mat& weights) {
    return 0.5 * as_scalar(trans(weights) * sigma * weights);
  }
  
  void Gradient(const arma::mat& weights, arma::mat& g) {
    g = sigma * weights;
  }
  
  size_t NumConstraints() {
    return 1;
  }
  
  double EvaluateConstraint(const size_t i, const arma::mat& weights) {
    return as_scalar(sum(weights)) - 1;
  }
  
  void GradientConstraint(const size_t i, const arma::mat& weights,
                          arma::mat& g) {
    
    double status = as_scalar(sum(weights)) - 1;
    arma::mat g_tmp(weights.n_rows, weights.n_cols, arma::fill::zeros);
    
    g_tmp[i] = status == 0 ? 0 : 1;
    g = g_tmp;
    
  }
  
private:
  
  const arma::mat& sigma;
  
};

arma::mat ens_min_var(const arma::mat& sigma) {
  
  MinVarFunction min_var(sigma);
  
  ens::AugLagrangian al;
  
  arma::mat weights(sigma.n_cols, 1, arma::fill::ones);
  
  al.Optimize(min_var, weights);
  
  return weights;
  
}

Expected behavior

In Python:

import numpy as np
from scipy.optimize import minimize

mu = [0.0427, 0.0015, 0.0285]

sigma = [[0.0100, 0.0018, 0.0011],
         [0.0018, 0.0109, 0.0026],
         [0.0011, 0.0026, 0.0199]]

start = np.array([1, 1, 1])
cons = [{"type": "eq", "fun": lambda params: np.sum(params) - 1}]

def min_var_obj(params, sigma):
    
    result = np.matmul(np.transpose(params), np.matmul(sigma, params))
    
    return result

def scipy_min_var(params, sigma):
    
    result = minimize(min_var_obj, params, args = (sigma,), constraints = cons)
    
    return result.x

weights = scipy_min_var(start, sigma)

print(sum(weights))
# 1.0

print(scipy_min_var(start, sigma))
# [0.43235641 0.37630212 0.19134147]

print(weights @ sigma @ weights)
# 0.005283486762168229

Actual behavior

Using ensmallen:

weights = ens_min_var(sigma);

sum(weights);
// 1.0

weights;
// [-0.7607172 0.913307 0.8474102]

trans(weights) * sigma * weights;
// 0.02927433

As you can see, the weights sum to one, but the ensmallen result is greater than the other optimizer's result. I must be missing something and would appreciate the help.

jasonjfoster avatar Jul 21 '22 16:07 jasonjfoster

Hello, are you expecting the same results from both implementations? The scipy and ensmallen implementations differ slightly, so I'm not surprised that we don't see the same results, especially if we use different default values. Can you rerun the example with:

al.Optimize(min_var, weights, ens::Report(0.1));

What I'd like to see is whether the optimization runs for the full 1000 iterations (the default).
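
In the context of the ens_min_var() function above, that would look roughly like the sketch below (ens::Report is ensmallen's reporting callback; it prints a summary of the run, including the number of iterations, once Optimize() returns):

MinVarFunction min_var(sigma);
ens::AugLagrangian al;
arma::mat weights(sigma.n_cols, 1, arma::fill::ones);

// The Report callback prints an optimization report after Optimize() finishes,
// which should show whether the 1000-iteration limit is being hit.
al.Optimize(min_var, weights, ens::Report(0.1));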

zoq avatar Jul 26 '22 19:07 zoq

Thanks, and I agree the scipy implementation might differ slightly, so here's an exact solution using Lagrange multipliers via Armadillo instead (which is similar to scipy's result):

arma::mat arma_min_var(const arma::mat& sigma) {
  
  int n_cols = sigma.n_cols;
  int n_size = n_cols + 1;
  arma::vec arma_ones(n_cols, arma::fill::ones);
  arma::mat A(n_size, n_size, arma::fill::zeros);
  arma::vec b(n_size, arma::fill::zeros);
  arma::mat weights(n_cols, 1);
  
  A.submat(0, 0, n_cols - 1, n_cols - 1) = sigma;
  A.submat(0, n_cols, n_cols - 1, n_cols) = arma_ones;
  A.submat(n_cols, 0, n_cols, n_cols - 1) = trans(arma_ones);
  
  b(n_size - 1) = 1;
  
  weights = arma::solve(A, b);
  
  return weights.submat(0, 0, n_cols - 1, 0);
  
}
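
(For reference, the linear system being built above is just the first-order / KKT condition of the equality-constrained problem, which I would write as

$$
\begin{bmatrix} \Sigma & \mathbf{1} \\ \mathbf{1}^\top & 0 \end{bmatrix}
\begin{bmatrix} w \\ \lambda \end{bmatrix}
=
\begin{bmatrix} \mathbf{0} \\ 1 \end{bmatrix},
$$

where $\lambda$ is, up to sign, the Lagrange multiplier for the sum-to-one constraint; the last element of arma::solve(A, b) is $\lambda$ and is dropped before returning.)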

Using armadillo:

weights = arma_min_var(sigma);

sum(weights);
// 1.0

weights;
// [ 0.4411, 0.3656, 0.1933]

trans(weights) * sigma * weights;
// 0.005281811

For reference, I attached the report when the initial parameters are ones (as in the initial report above) and also when the exact solution from Armadillo is used as the starting values. Both results are still greater than expected, but providing optimal starting values gets closer (although still not exact).

Do you think it's related to the starting values and/or the number of iterations? Any advice on setting these (or other) parameters would be much appreciated.

jasonjfoster avatar Jul 27 '22 01:07 jasonjfoster

My memory from years ago is that it can be tricky to make this implementation of the augmented Lagrangian method converge, but that is only in the context of what it was written for, which was solving SDPs. (If you are interested, see Monteiro + Burer 2004.)

Try adding #define ENS_PRINT_INFO to your program, and I think it will be easier to see what is going on. Perhaps sigmaUpdateFactor is too large, or maybe the optimization is terminating for some incorrect reason.
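
For example (a minimal sketch; the macro just needs to be defined before ensmallen.hpp is included):

// Enable ensmallen's informational output, which should show why the
// optimization terminates.  This must come before the ensmallen include.
#define ENS_PRINT_INFO
#define ENS_PRINT_WARN  // optional: also print warnings
#include <ensmallen.hpp>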

rcurtin avatar Aug 24 '22 20:08 rcurtin

Thanks for the insight and guidance. I added #define ENS_PRINT_INFO to my program, which has been helpful for trying out various combinations of penaltyThresholdFactor and sigmaUpdateFactor, but I still have not been able to tie out the results. For reference, here is the print info for sigmaUpdateFactor = 10 (the default) and sigmaUpdateFactor = 1.1 when the exact solution from Armadillo is used as the initial values.

jasonjfoster avatar Sep 05 '22 21:09 jasonjfoster

There is a tradeoff between penaltyThresholdFactor and sigmaUpdateFactor. Basically, the way the augmented Lagrangian optimizer works is that it adds a "soft" penalty corresponding to the constraints to the objective, and asks L-BFGS to solve this modified objective. When L-BFGS solves this modified objective successfully, we increase sigma by multiplying by sigmaUpdateFactor, and then we also decrease the penalty threshold for when we will update sigma again.
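
Roughly, the subproblem that we hand to L-BFGS looks like this (writing from memory, so the exact sign conventions in the implementation may differ):

$$
L_\sigma(w, \lambda) = f(w) - \sum_i \lambda_i \, c_i(w) + \frac{\sigma}{2} \sum_i c_i(w)^2
$$

As $\sigma$ grows, violations of the constraints $c_i(w) = 0$ are penalized more and more heavily relative to the original objective $f(w)$.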

So, depending on the problem being solved, you may need a very small sigmaUpdateFactor, so that the penalty term does not overwhelm the objective too quickly. I played with your example and was able to make it converge to 0.532825 with a penaltyThresholdFactor of 0.05 and sigmaUpdateFactor of 1.04, which is pretty close to the closed-form solution you gave, but it seemed like the result was fairly sensitive to the parameters.
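
In case it helps, this is roughly how I set those values (a sketch assuming the (maxIterations, penaltyThresholdFactor, sigmaUpdateFactor) constructor; double-check the argument order against the AugLagrangian documentation for your ensmallen version):

// Assumed constructor order: (maxIterations, penaltyThresholdFactor, sigmaUpdateFactor).
// 1000 is the default iteration limit; 0.05 and 1.04 are the values that
// worked reasonably well for this problem.
ens::AugLagrangian al(1000, 0.05, 1.04);
al.Optimize(min_var, weights, ens::Report(0.1));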

I assume scipy is using a different algorithm here, and it may be that the augmented Lagrangian algorithm is simply not a great choice for this problem.

rcurtin avatar Sep 14 '22 22:09 rcurtin

Thanks for the explanation and for trying out my example. Would you happen to recommend another solver for my problem instead? Specifically, I'm just trying to solve the standard "minimize portfolio variance with weights that sum to one" problem.

jasonjfoster avatar Sep 15 '22 17:09 jasonjfoster

This issue has been automatically marked as stale because it has not had any recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions! :+1:

mlpack-bot[bot] avatar Oct 15 '22 18:10 mlpack-bot[bot]

Hey, I'm sorry for the slow response here. This fell off my list somehow. Unfortunately, I don't actually have another recommendation here; ensmallen's support for constrained optimizers is not extensive, although I would certainly like to see that improve in future releases.

rcurtin avatar Oct 18 '22 19:10 rcurtin