zuko
zuko copied to clipboard
Add option to change covariance matrix type for GMM class
This PR adds changes to the zuko.flows.mixture.GMM
class, which allow the user to change the type of the covariance matrix used for each of the Gaussian components of the mixture.
The options added are
-
covariance_type
, which allows to change the type of the covariance matrices -
tied
a switch which allows to control if covariance matrices are tied between components -
cov_rank
the rank of the low-rank covariance matrix whencovariance_type
is 'lowrank'
Since the construction of the shapes got quite long I moved this part in its own function.
Below is an illustration of the effect these different choices have for a mixture of 3 two-dimensional Gaussians.
Hello @dominik-strutz, I quickly went over the code, and it looks nice! I think we should add some tests however, maybe in a new tests/test_flows_gmm.py
file.
Are you also planning to improve the initialization as well?
Hi @francois-rozet, Yes, I am happy to write some tests.
I'm also happy to try to improve the initialization. Following sklearn
again, the initialization methods 'random' and 'random_from_data' (the latter one working quite well in my limited experience) should be easy enough to implement. Also, implementing 'k-mean' or 'k-means++' should be manageable. I don't know how to handle the conditional case yet, but having only the unconditional one would be a good first start.
What is your opinion on how to structure the initialization? I think it would be beneficial to keep the __init__
method quite simple and have a separate method (e.g., GMM.initialization) that takes user-provided data samples and sets the phi variable or the last layer of the network to reasonable values given an initialization method.
I will give the initialisation a try and let you know how it goes.
P.S: I have no idea why the pre-commit hook fails. I used ruff --fix
locally to reformat, and from what I can see, it fulfils the requirements.
I don't know how to handle the conditional case yet, but having only the unconditional one would be a good first start.
I think a good way to handle the conditional case would be to make the weight $W$ of the last layer small (e.g. standard initialization * 1e-2) and set the bias $b$ to the unconditional initialization.
What is your opinion on how to structure the initialization?
I agree that a separate method could be appropriate, similar to the reset_parameters
of nn.Linear
.
P.S: I have no idea why the pre-commit hook fails. I used ruff --fix locally to reformat, and from what I can see, it fulfils the requirements.
I pulled your branch and ruff check .
at the root of the repo returns
zuko/flows/mixture.py:7:1: I001 [*] Import block is un-sorted or un-formatted
Found 1 error.
[*] 1 fixable with the `--fix` option.
Maybe you were not at the root? My version of ruff is 0.1.14
by the way.
@dominik-strutz Do you still plan on contributing this PR?
Yes, I still like to contribute but haven't found much free time to do it recently. I have implemented most of the initialization methods for the unconditional case, but it still needs to be polished up and tested. The extension for the conditional case shouldn't take too long afterwards. If you or someone else wants to continue this sooner, I'm happy to push an intermediary commit of everything I have so far.
No problem, take your time! I am currently updating a few things and wanted to know if I should wait for this PR for the next minor release.