AugLagrangian with constraints validation
Issue description
Hi, I'm using ens::AugLagrangian to minimize variance with a constraint (weights sum to one) and am struggling to validate the results against another optimizer. The constraint appears to be satisfied, but the ensmallen result remains greater than the other optimizer's result. I followed another example (see https://github.com/mlpack/ensmallen/issues/199) but must still be missing something and would appreciate the help.
Your environment
- version of ensmallen: 2.19.0
- operating system: Windows
- compiler: g++
- version of Armadillo: 11.2.2
- any other environment information you think is relevant:
Steps to reproduce
#include <armadillo>
#include <ensmallen.hpp>

class MinVarFunction
{
 public:
  MinVarFunction(const arma::mat& sigma)
    : sigma(sigma) { }

  // Objective: 0.5 * w' * sigma * w.
  double Evaluate(const arma::mat& weights)
  {
    return 0.5 * as_scalar(trans(weights) * sigma * weights);
  }

  // Gradient of the objective: sigma * w.
  void Gradient(const arma::mat& weights, arma::mat& g)
  {
    g = sigma * weights;
  }

  // One equality constraint: sum(w) - 1 = 0.
  size_t NumConstraints()
  {
    return 1;
  }

  double EvaluateConstraint(const size_t i, const arma::mat& weights)
  {
    return as_scalar(sum(weights)) - 1;
  }

  void GradientConstraint(const size_t i, const arma::mat& weights,
                          arma::mat& g)
  {
    double status = as_scalar(sum(weights)) - 1;
    arma::mat g_tmp(weights.n_rows, weights.n_cols, arma::fill::zeros);
    g_tmp[i] = status == 0 ? 0 : 1;
    g = g_tmp;
  }

 private:
  const arma::mat& sigma;
};

arma::mat ens_min_var(const arma::mat& sigma)
{
  MinVarFunction min_var(sigma);
  ens::AugLagrangian al;
  arma::mat weights(sigma.n_cols, 1, arma::fill::ones);
  al.Optimize(min_var, weights);
  return weights;
}
Expected behavior
In Python:
import numpy as np
from scipy.optimize import minimize

mu = [0.0427, 0.0015, 0.0285]
sigma = [[0.0100, 0.0018, 0.0011],
         [0.0018, 0.0109, 0.0026],
         [0.0011, 0.0026, 0.0199]]
start = np.array([1, 1, 1])
cons = [{"type": "eq", "fun": lambda params: np.sum(params) - 1}]

def min_var_obj(params, sigma):
    result = np.matmul(np.transpose(params), np.matmul(sigma, params))
    return result

def scipy_min_var(params, sigma):
    result = minimize(min_var_obj, params, args=(sigma,), constraints=cons)
    return result.x

weights = scipy_min_var(start, sigma)
print(sum(weights))
# 1.0
print(scipy_min_var(start, sigma))
# [0.43235641 0.37630212 0.19134147]
print(weights @ sigma @ weights)
# 0.005283486762168229
Actual behavior
Using ensmallen:
weights = ens_min_var(sigma)
sum(weights);
// 1.0
weights;
// [-0.7607172 0.913307 0.8474102]
trans(weights) * sigma * weights;
// 0.02927433
As you can see, the weights sum to one (so the constraint appears to be satisfied), but the ensmallen objective value is greater than the other optimizer's result. I must be missing something and would appreciate the help.
Hello, are you expecting the same results from both implementations? The scipy and ensmallen implementations differ slightly, so I'm not surprised that we don't see the same results, especially if different default values are used. Can you rerun the example with:
al.Optimize(min_var, weights, Report(0.1));
What I'd like to see is whether the optimization runs for the full 1000 iterations (the default).
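In case it helps, here is a minimal sketch of what that looks like dropped into the ens_min_var function above (renamed ens_min_var_report here; the ens::Report callback and the exact meaning of the 0.1 argument are described in the ensmallen callbacks documentation):

arma::mat ens_min_var_report(const arma::mat& sigma)
{
  MinVarFunction min_var(sigma);
  ens::AugLagrangian al;
  arma::mat weights(sigma.n_cols, 1, arma::fill::ones);

  // The Report callback prints a summary of the optimization
  // (iterations run, final objective, etc.) once it finishes.
  al.Optimize(min_var, weights, ens::Report(0.1));
  return weights;
}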
Thanks, and I agree the scipy implementation might differ slightly, so here's an exact solution using Lagrange multipliers via Armadillo instead (which gives results similar to scipy's):
arma::mat arma_min_var(const arma::mat& sigma)
{
  int n_cols = sigma.n_cols;
  int n_size = n_cols + 1;
  arma::vec arma_ones(n_cols, arma::fill::ones);
  arma::mat A(n_size, n_size, arma::fill::zeros);
  arma::vec b(n_size, arma::fill::zeros);
  arma::mat weights(n_cols, 1);

  // Build the KKT system: [ sigma  1 ; 1'  0 ] [ w ; lambda ] = [ 0 ; 1 ].
  A.submat(0, 0, n_cols - 1, n_cols - 1) = sigma;
  A.submat(0, n_cols, n_cols - 1, n_cols) = arma_ones;
  A.submat(n_cols, 0, n_cols, n_cols - 1) = trans(arma_ones);
  b(n_size - 1) = 1;

  weights = arma::solve(A, b);
  return weights.submat(0, 0, n_cols - 1, 0);
}
Using Armadillo:
weights = arma_min_var(sigma)
sum(weights);
// 1.0
weights;
// [ 0.4411, 0.3656, 0.1933]
trans(weights) * sigma * weights;
// 0.005281811
For reference, I've attached the report when the initial parameters are ones (as in the initial report above) and also when the exact solution from Armadillo is used as the starting values. Both results are still greater than expected, but providing optimal starting values gets closer (although still not exact).
Do you think it's related to the starting values and/or the number of iterations? Any advice on setting these (or other) parameters would be much appreciated.
My memory from years ago is that it can be tricky to make this implementation of the augmented Lagrangian method converge, but that is only in the context of what it was written for, which was solving SDPs. (If you are interested, see Monteiro and Burer, 2004.)
Try adding #define ENS_PRINT_INFO to your program, and I think maybe it will be easier to see what is going on. Perhaps sigmaUpdateFactor is too large? Maybe optimization is terminating for some incorrect reason.
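For clarity, a minimal sketch of where that define has to go (it only takes effect if it appears before ensmallen.hpp is included):

// Must come before the ensmallen include, otherwise the macro has no effect.
#define ENS_PRINT_INFO
#define ENS_PRINT_WARN  // optional: also print warnings

#include <ensmallen.hpp>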
Thanks for the insight and guidance. I added #define ENS_PRINT_INFO to my program, which has been helpful for trying out various combinations of penaltyThresholdFactor and sigmaUpdateFactor, but I still have not been able to tie out the results yet. For reference, here is the print info for sigmaUpdateFactor = 10 (the default) and sigmaUpdateFactor = 1.1 when the exact solution from Armadillo is used as the initial values.
There is a tradeoff between penaltyThresholdFactor and sigmaUpdateFactor. Basically, the way the augmented Lagrangian optimizer works is that it adds a "soft" penalty corresponding to the constraints to the objective, and asks L-BFGS to solve this modified objective. When L-BFGS solves this modified objective successfully, we increase sigma by multiplying by sigmaUpdateFactor, and then we also decrease the penalty threshold for when we will update sigma again.
So, depending on the problem being solved, you may need a very small sigmaUpdateFactor, so that the penalty term does not overwhelm the objective too quickly. I played with your example and was able to make it converge to 0.532825 with a penaltyThresholdFactor of 0.05 and a sigmaUpdateFactor of 1.04, which is pretty close to the closed-form solution you gave, but it seemed like the result was fairly sensitive to the parameters.
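For reference, a rough sketch of setting those two values, assuming the PenaltyThresholdFactor() and SigmaUpdateFactor() accessors that recent ensmallen versions expose on AugLagrangian (check the AugLagrangian documentation for your version; these values can also be passed to the constructor):

MinVarFunction min_var(sigma);
ens::AugLagrangian al;

// Grow the penalty slowly so it does not overwhelm the variance objective.
al.PenaltyThresholdFactor() = 0.05;
al.SigmaUpdateFactor() = 1.04;

arma::mat weights(sigma.n_cols, 1, arma::fill::ones);
al.Optimize(min_var, weights);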
I assume scipy is using a different algorithm here, and it may be that the augmented Lagrangian algorithm is simply not a great choice for this problem.
Thanks for the explanation and for trying out my example. Would you happen to have another solver to recommend for my problem instead? Specifically, I'm just trying to solve the standard "minimize portfolio variance with weights that sum to one" problem.
This issue has been automatically marked as stale because it has not had any recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions! :+1:
Hey, I'm sorry for the slow response here. This fell off my list somehow. Unfortunately I don't actually have another recommendation here---ensmallen's support for constrained optimizers is not extensive, although I would certainly like to see that improve in future releases.