LBFGS with constraints on x
First of all, I want to say thanks to all the contributors of this crate; you've done an amazing job.
I'm getting started with argmin by running some basic examples. I wanted to solve a simple constrained optimization problem: finding the minimum of a half-line (y = x with x >= 0), starting from x = 10 and using LBFGS as the solver. Of course the solution to this problem is x = 0 and the minimum is zero. I've implemented the CostFunction and Gradient traits as follows:
use std::sync::{Arc, Mutex};

use argmin::core::{CostFunction, Error, Gradient};

#[derive(Clone)]
struct Problem {
    x: Arc<Mutex<Floating>>,
}

impl Problem {
    fn new() -> Self {
        Self {
            x: Arc::new(Mutex::new(5.0)),
        }
    }

    fn f(&self, x: Floating) -> Floating {
        let mut changeable_x = self.x.lock().unwrap();
        *changeable_x = x;
        println!("x:{:?}", x);
        if x < self.x_lower_bound() {
            return self.sup();
        }
        x
    }

    fn grad_f(&self, x: Floating) -> Vec<Floating> {
        // vec![2.0 * x]
        // vec![2.0 * (x - 4.0)]
        if x <= self.x_lower_bound() {
            return vec![0.0];
        }
        vec![1.0]
    }

    fn x_lower_bound(&self) -> Floating {
        0.0
    }

    fn sup(&self) -> Floating {
        100.0
    }
}

impl CostFunction for Problem {
    type Param = Vec<Floating>;
    type Output = Floating;

    fn cost(&self, x: &Self::Param) -> Result<Self::Output, Error> {
        let output = self.f(x[0]);
        println!("image: {:?}", output);
        Ok(output)
    }
}

impl Gradient for Problem {
    type Param = Vec<Floating>;
    type Gradient = Vec<Floating>;

    fn gradient(&self, x: &Self::Param) -> Result<Self::Gradient, Error> {
        let output = self.grad_f(x[0]);
        println!("gradient: {:?}", output);
        Ok(output)
    }
}
Where Floating is a type alias for f64.
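For completeness, that alias is simply:

type Floating = f64;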
I've tried to incorporate the constraint x >= 0 by returning 100.0 whenever x < 0 (I tried returning f64::INFINITY, but then the code panicked with the error MoreThuenteLineSearch: NaN or Inf encountered during iteration).
This is the code for solving the problem:
#[test]
fn test_min() {
    let linesearch: MoreThuenteLineSearch<Vec<f64>, Vec<f64>, f64> = MoreThuenteLineSearch::new();
    let initial_guess: Vec<f64> = vec![10.0; 1];
    let lbfgs: LBFGS<MoreThuenteLineSearch<Vec<f64>, Vec<f64>, f64>, Vec<f64>, Vec<f64>, f64> = LBFGS::new(linesearch, 10);
    let cost = Problem::new();
    let res = Executor::new(cost.clone(), lbfgs)
        .configure(|state| state.param(initial_guess).max_iters(100))
        .add_observer(SlogLogger::term(), ObserverMode::Always)
        .run()
        .unwrap();
    println!("Solution: {:?}", res.state);
}
By setting the number of iterations to 1, it solves the problem as follows:
running 1 test
x:10.0
image: 10.0
gradient: [1.0]
x:9.0
image: 9.0
May 14 17:55:34.046 INFO L-BFGS, max_iters: 1
gradient: [1.0]
x:5.0
image: 5.0
gradient: [1.0]
x:-11.0
image: 100.0
gradient: [0.0]
x:-5.5600000000000005
image: 100.0
gradient: [0.0]
x:4.814681808522201
image: 4.814681808522201
gradient: [1.0]
x:-2.0326081851024504
image: 100.0
gradient: [0.0]
x:0.29547041272993013
image: 0.29547041272993013
gradient: [1.0]
x:0.29547041272993013
image: 0.29547041272993013
gradient: [1.0]
gradient: [1.0]
May 14 17:55:34.057 INFO iter: 0, cost: 0.29547041272993013, best_cost: 0.29547041272993013, gradient_count: 10, cost_count: 9, time: 0.0106955, gamma: 1
Solution: IterState { param: Some([0.29547041272993013]), prev_param: None, best_param: Some([0.29547041272993013]), prev_best_param: Some([10.0]), cost: 0.29547041272993013, prev_cost: 10.0, best_cost: 0.29547041272993013, prev_best_cost: 10.0, target_cost: -inf, grad: Some([1.0]), prev_grad: None, hessian: None, prev_hessian: None, inv_hessian: None, prev_inv_hessian: None, jacobian: None, prev_jacobian: None, iter: 1, last_best_iter: 0, max_iters: 1, counts: {"cost_count": 9, "gradient_count": 10}, time: Some(12.3175ms), termination_status: Terminated(MaxItersReached) }
And indeed the solution is close to zero, if not exactly zero. However, with the number of iterations set to a value greater than 1, the code loops endlessly, remaining stuck at x = NaN, objective = NaN and gradient = 1:
x:NaN
image: NaN
gradient: [1.0]
Any thoughts?
Apologies for the very very late reply! I have been very busy over the last couple of months and my access to the internet is currently limited (#360).
Thanks for this detailed and helpful report. Unfortunately, I haven't been able to test this myself yet. As a first guess, I think the problem is that you set the gradient to 0 whenever the solver is outside of the feasible range. Could you try to change your setup such that the gradient increases the further away it is from the feasible range? Something along the lines of the sketch below.
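Here is an untested sketch of a quadratic penalty (the weight mu, the helper names and the small main are my own choices for illustration, not part of argmin's API): outside the feasible region both the cost and the gradient grow with the size of the violation instead of being flat.

type Floating = f64;

/// Penalised objective: x + mu * max(0, -x)^2
fn penalised_cost(x: Floating, mu: Floating) -> Floating {
    let violation = (-x).max(0.0); // how far x lies below the bound 0
    x + mu * violation * violation
}

/// Gradient of the penalised objective: 1 - 2 * mu * max(0, -x)
fn penalised_gradient(x: Floating, mu: Floating) -> Vec<Floating> {
    let violation = (-x).max(0.0);
    vec![1.0 - 2.0 * mu * violation]
}

fn main() {
    let mu = 1_000.0;
    // Inside the feasible region this is just the gradient of y = x ...
    assert_eq!(penalised_gradient(2.0, mu), vec![1.0]);
    // ... outside it, the gradient magnitude grows with the distance to the
    // feasible region, which pushes the iterate back towards x >= 0.
    assert_eq!(penalised_gradient(-1.0, mu), vec![-1999.0]);
    assert_eq!(penalised_cost(-1.0, mu), 999.0);
}

Plugging something like this into your cost and gradient implementations (and, if needed, increasing mu over successive solves, i.e. the classic quadratic-penalty approach) should give the line search a useful descent direction even when an iterate lands outside x >= 0.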