argmin icon indicating copy to clipboard operation
argmin copied to clipboard

CMA-ES

Open Armavica opened this issue 6 years ago • 19 comments

Hello, CMA-ES is a powerful evolutionary algorithm for black-box optimization problems. If there is interest in adding it to this optimization suite, I would be happy to give it a try. My plan is to start by following the purecma.py implementation, a public domain pedagogical implementation by the author of the algorithm, and add more advanced features if necessary.

Armavica avatar Mar 16 '19 19:03 Armavica

Hi,

Thanks for your interest in contributing to argmin! CMA-ES would be a great addition to argmin and I'd very much appreciate it if you would implement this.

We've recently made a lot of changes (#3 ) to the code, therefore it could easily happen that you run into unforeseen problems. Also, the documentation isn't everywhere up to date yet and may sometimes be not very helpful, outright wrong or may not exist at all. Please don't hesitate to get in touch in case you have any problems, suggestions of if your code requires changes to the overall design of argmin.

stefan-k avatar Mar 16 '19 21:03 stefan-k

Hi @Armavica,

Have you had a chance to give this a try yet? Is there anything I can help you with?

stefan-k avatar Apr 24 '19 08:04 stefan-k

Hi! Sorry for the delay, I have been busy and out of the country for the last few weeks. I do have a working prototype, and I wanted to add more tests before submitting it. I will do that as soon as I find time.

Armavica avatar Apr 24 '19 14:04 Armavica

Sounds great! :) There's no rush, take your time!

stefan-k avatar Apr 24 '19 14:04 stefan-k

I would be interested in implementing this algorithm and others for a chemical yield optimisation project at work. I have only recently started learning Rust but am happy to try and develop further with help. If this has already been done then perhaps there is something else I can help with.

PyPete avatar Jan 06 '20 20:01 PyPete

Hi @PyPete ,

thanks for your interest in contributing! I haven't heard anything from @Armavica in a while, so I'm not sure what the current status of this is. I'm still very much interested in an implementation of CMA-ES. Unless Armavica objects I would suggest that you start working on this feature if you like. Should Armavica come back to this I'm sure that there will be the possibility to merge the implementations.

I'm happy to help you with the implementation and guide you. Just open a WIP pull request and don't hesitate to ask any questions :)

stefan-k avatar Jan 07 '20 08:01 stefan-k

Hi @stefan-k, I'll take care of this issue, if you don't mind! I already have a working implementation of this algorithm in Scala https://github.com/VolodymyrOrlov/cma-es-scala and I think I have a good understanding on how the method works

VolodymyrOrlov avatar May 26 '22 23:05 VolodymyrOrlov

@VolodymyrOrlov sounds great! :)

The only other population based method so far is particle swarm optimization which uses PopulationState to carry state from one iteration to the next. PopulationState is therefore somewhat "tailored" to PSO and as such you may have to adapt it to your needs. Let me know if you have any questions or run into any problems!

stefan-k avatar May 29 '22 06:05 stefan-k

Agree! Looks like my state will have to carry on a NxP 2d array, where N is my population size and P is number of parameters. This way I'll be able to quickly perform Eigendecomposition operation, which is a central operation in CMA-ES.

VolodymyrOrlov avatar Jun 01 '22 02:06 VolodymyrOrlov

I also need to add these new operations to the argmin-math:

  • argsort: https://numpy.org/doc/stable/reference/generated/numpy.argsort.html
  • eigSym: https://www.rdocumentation.org/packages/MGMM/versions/0.3.1/topics/eigSym
  • np.random.standard_normal: https://numpy.org/doc/stable/reference/random/generated/numpy.random.standard_normal.html
  • np.outer: https://numpy.org/doc/stable/reference/generated/numpy.outer.html
  • np.take: https://numpy.org/doc/stable/reference/generated/numpy.take.html

I hope you don't mind me implementing all these new methods and traits for CMA-ES :)

VolodymyrOrlov avatar Jun 01 '22 02:06 VolodymyrOrlov

Looks like my state will have to carry on a NxP 2d array, where N is my population size and P is number of parameters. This way I'll be able to quickly perform Eigendecomposition operation, which is a central operation in CMA-ES.

Currently the population is stored as a Vec<P> in PopulationState. I think this can be implemented in a more flexible way with generics which would allow both your NxP 2D array and Vec<P>.

If you need to carry state which is very specific to CMA-ES, then you can of course store them in your CMA-ES struct.

I hope you don't mind me implementing all these new methods and traits for CMA-ES :)

Not at all! :)

np.outer: https://numpy.org/doc/stable/reference/generated/numpy.outer.html

I think this is already covered with ArgminDot, at least for ndarray: https://github.com/argmin-rs/argmin/blob/2943e256800196ac800d477cd3f99def8a08c142/argmin-math/src/ndarray_m/dot.rs#L35

But I have no objections to split this off into an ArgminOuter trait since it's more explicit. Also, the current implementation looks awfully inefficient.

stefan-k avatar Jun 01 '22 05:06 stefan-k

@stefan-k , while working on CMA-ES I've stumbled upon this implementation of a dot product between 2 matrices: https://github.com/argmin-rs/argmin/blob/main/argmin-math/src/vec/dot.rs#L51 It seems to be some kind of task-specific implementation. Not sure where is it used in the code. Can I replace it with a more traditional dot product, similar to np.dot?

VolodymyrOrlov avatar Jun 06 '22 23:06 VolodymyrOrlov

It is needed (at least) in BFGS. Both ndarray and np.dot compute the matrix product when given two 2D matrices. The ArgminDot implementation for Vecs should also do so (unless there is a bug). What would you want to replace it with?

stefan-k avatar Jun 07 '22 05:06 stefan-k

Current implementation fails this test: image

#[test]
fn [<test_mat_mat_2_ $t>]() {
    let a = vec![
        vec![1 as $t, 2 as $t, 3 as $t],
        vec![4 as $t, 5 as $t, 6 as $t]
    ];
    let b = vec![
        vec![1 as $t, 2 as $t],
        vec![3 as $t, 4 as $t],
        vec![5 as $t, 6 as $t]
    ];
    let expected = vec![
        vec![22 as $t, 28 as $t],
        vec![49 as $t, 64 as $t],                        
    ];
    
    let product = a.dot(&b);

    println!("{:?}", product);

    assert_eq!(product.len(), 2);
    assert_eq!(product[0].len(), 2);

    product.iter().flatten().zip(expected.iter().flatten()).for_each(|(&a, &b)| assert!((a as f64 - b as f64).abs() < std::f64::EPSILON));
}

Also, why do we need to transpose second matrix right before dot product? I have another implementation that works with existing test, where you multiply 3x3 matrices, as well as with a new test

VolodymyrOrlov avatar Jun 07 '22 16:06 VolodymyrOrlov

Also, I am not sure I understand why this test should fail:

item! {
#[test]
#[should_panic]
fn [<test_mat_mat_panic_4_ $t>]() {
    let a = vec![
        vec![1 as $t, 2 as $t, 3 as $t],
        vec![4 as $t, 5 as $t, 6 as $t],
        vec![3 as $t, 2 as $t, 1 as $t]
    ];
    let b = vec![
        vec![3 as $t, 2 as $t],
        vec![6 as $t, 5 as $t],
        vec![3 as $t, 2 as $t]
    ];
    a.dot(&b);
}
}

image

VolodymyrOrlov avatar Jun 07 '22 16:06 VolodymyrOrlov

I've had a closer look and I agree, the transpose is complete nonsense. Not only that, the entire implementation makes no sense at all! This is quite embarrassing. I have no idea what I was thinking.

Sorry about any inconvenience this has caused and thanks for digging into this! I would appreciate it if you could replace that with your working implementation.

stefan-k avatar Jun 08 '22 07:06 stefan-k

No worries, @stefan-k ! There are no "right" and "wrong" implementations, only implementations that works and ones that don't. I assume existing implementation works just fine for BFGS, hence it is good enough. But I will replace it with another version, that can handle both my and BFGS use cases.

VolodymyrOrlov avatar Jun 08 '22 18:06 VolodymyrOrlov

I'd argue the current implementation is definitely wrong, even for BFGS ;) The only reason why nobody noticed/complained is likely because people tend to (and should) use the ndarray/nalgebra math backends for matrix/linalg-heavy solvers. The Vec based math backend is neither efficient nor complete (for instance, ArgminInv is missing). The ArgminDot implementations for both the nalgebra and ndarray backends should do the right thing, because they just forward the calls to the respective functions of the crates.

stefan-k avatar Jun 08 '22 18:06 stefan-k

Hi @stefan-k! I have a first version of the CMA-ES implementation ready for your review in this PR. This version can definitely be improved in the future, but as a MVP I think it is good enough.

VolodymyrOrlov avatar Jun 24 '22 21:06 VolodymyrOrlov