zuko icon indicating copy to clipboard operation
zuko copied to clipboard

Add option to change covariance matrix type for GMM class

Open dominik-strutz opened this issue 10 months ago • 6 comments

This PR adds changes to the zuko.flows.mixture.GMM class, which allow the user to change the type of the covariance matrix used for each of the Gaussian components of the mixture.

The options added are

  • covariance_type, which allows to change the type of the covariance matrices
  • tied a switch which allows to control if covariance matrices are tied between components
  • cov_rank the rank of the low-rank covariance matrix when covariance_type is 'lowrank'

Since the construction of the shapes got quite long I moved this part in its own function.

Below is an illustration of the effect these different choices have for a mixture of 3 two-dimensional Gaussians.

image

dominik-strutz avatar Apr 03 '24 13:04 dominik-strutz

Hello @dominik-strutz, I quickly went over the code, and it looks nice! I think we should add some tests however, maybe in a new tests/test_flows_gmm.py file.

Are you also planning to improve the initialization as well?

francois-rozet avatar Apr 03 '24 15:04 francois-rozet

Hi @francois-rozet, Yes, I am happy to write some tests.

I'm also happy to try to improve the initialization. Following sklearn again, the initialization methods 'random' and 'random_from_data' (the latter one working quite well in my limited experience) should be easy enough to implement. Also, implementing 'k-mean' or 'k-means++' should be manageable. I don't know how to handle the conditional case yet, but having only the unconditional one would be a good first start.

What is your opinion on how to structure the initialization? I think it would be beneficial to keep the __init__ method quite simple and have a separate method (e.g., GMM.initialization) that takes user-provided data samples and sets the phi variable or the last layer of the network to reasonable values given an initialization method.

I will give the initialisation a try and let you know how it goes.

P.S: I have no idea why the pre-commit hook fails. I used ruff --fix locally to reformat, and from what I can see, it fulfils the requirements.

dominik-strutz avatar Apr 04 '24 08:04 dominik-strutz

I don't know how to handle the conditional case yet, but having only the unconditional one would be a good first start.

I think a good way to handle the conditional case would be to make the weight $W$ of the last layer small (e.g. standard initialization * 1e-2) and set the bias $b$ to the unconditional initialization.

What is your opinion on how to structure the initialization?

I agree that a separate method could be appropriate, similar to the reset_parameters of nn.Linear.

P.S: I have no idea why the pre-commit hook fails. I used ruff --fix locally to reformat, and from what I can see, it fulfils the requirements.

I pulled your branch and ruff check . at the root of the repo returns

zuko/flows/mixture.py:7:1: I001 [*] Import block is un-sorted or un-formatted
Found 1 error.
[*] 1 fixable with the `--fix` option.

Maybe you were not at the root? My version of ruff is 0.1.14 by the way.

francois-rozet avatar Apr 04 '24 13:04 francois-rozet

@dominik-strutz Do you still plan on contributing this PR?

francois-rozet avatar Jun 06 '24 22:06 francois-rozet

Yes, I still like to contribute but haven't found much free time to do it recently. I have implemented most of the initialization methods for the unconditional case, but it still needs to be polished up and tested. The extension for the conditional case shouldn't take too long afterwards. If you or someone else wants to continue this sooner, I'm happy to push an intermediary commit of everything I have so far.

dominik-strutz avatar Jun 07 '24 09:06 dominik-strutz

No problem, take your time! I am currently updating a few things and wanted to know if I should wait for this PR for the next minor release.

francois-rozet avatar Jun 07 '24 09:06 francois-rozet