Multi-Modal Based Truncation
Hi! Thank you for your excellent work and summary!
I would like to know how to use multi-modal based truncation.
Thank you! The multi-modal truncation code is a skeleton for now (it still needs polishing so you can use it with your own models), but it works like so:
- `network_pkl` is the path or URL to the saved model you want to use
- `description` is the name of the folder that will be created in `outdir`, where the images will be saved
- `num_clusters` can be changed to whatever you want, but the paper says 64 is good enough
So basically, we generate 60,000 random dlatents, standardize them, and then cluster them with KMeans. I have `verbose=1` in the KMeans call, but you can set it to `0` to stop it from printing progress. This will take a few minutes since it runs on the CPU, but perhaps there's a possibility of using the GPU in the future.
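The clustering step above can be sketched roughly like this. This is a simplified, self-contained sketch, not the repo's actual code: random vectors stand in for the mapped dlatents (the real script maps 60,000 `z` samples through the generator's mapping network), and all names here are illustrative. A smaller sample count is used so the sketch runs quickly.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.RandomState(0)

# Stand-in for mapped dlatents; the real code draws 60,000 samples
# and maps them through G.mapping before clustering.
num_samples, w_dim = 2_000, 512
dlatents = rng.randn(num_samples, w_dim).astype(np.float32)

# Standardize per dimension before clustering
mean, std = dlatents.mean(axis=0), dlatents.std(axis=0)
dlatents_std = (dlatents - mean) / std

# Cluster into num_clusters centers; verbose=1 in the real script
num_clusters = 64
kmeans = KMeans(n_clusters=num_clusters, n_init=10, verbose=0,
                random_state=0).fit(dlatents_std)

# Undo the standardization to get centers back in W space
centers = kmeans.cluster_centers_ * std + mean
print(centers.shape)  # (64, 512)
```

Each row of `centers` is one cluster center in W, which is what ends up saved to disk as the new truncation targets.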
In the end, 3 images are saved:
- A grid with: the pure dlatent generated from the `seed`, the global average image (`w_avg`), and the pure dlatent truncated towards this global average
- A grid of all 64 centers (or however many `num_clusters` you used) that are found in the latent space W
- The pure dlatent truncated towards each of these 64 new centers
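"Truncated towards a center" here is the usual truncation-trick interpolation, just with a cluster center in place of `w_avg`. A minimal sketch (names and the toy vectors are illustrative, not the repo's code):

```python
import numpy as np

def truncate(w, center, psi=0.7):
    """Pull dlatent w towards a chosen center (w_avg or a cluster center).
    psi=1.0 leaves w unchanged; psi=0.0 returns the center itself."""
    return center + psi * (w - center)

# Toy vectors standing in for a real dlatent and a cluster center
w = np.ones(512, dtype=np.float32)
center = np.zeros(512, dtype=np.float32)
print(truncate(w, center, psi=0.5)[0])  # → 0.5
```

The same function applies whether `center` is the global `w_avg` or one of the new cluster centers; only the target of the interpolation changes.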
Also, the 64 new centers are saved as `.npy` files, so you can use them in other parts of the code. For example, `generate.py` lets you use a new center instead of `w_avg` by passing `--new-center` followed by the path to any of these 64 new centers/clusters.
Let me know if you have more questions. This code still needs to be reworked and made prettier so it's easier to use and clearer as well.