Multi-layer structure hypernetwork and LR scheduler
Supports previously trained hypernetworks (1x -> 2x -> 1x simple networks)
Tested creating / training a hypernetwork with the [1, 2, 2, 1] argument. Tested training from an existing hypernetwork named 'anime'.
Does not implement Gradio frontends.
Complex structures might improve hypernetwork performance drastically for large datasets, even though a hypernetwork cannot train an inductive bias into the original model.
Multi-layer structure
As defined by
(HypernetworkModule(size, multipliers=[1, 2, 1]), HypernetworkModule(size, multipliers=[1, 2, 1]))
it is a 1x -> 2x -> 1x fully connected (Linear, or 'dense') layer structure.
It can be changed to [1, 2, 4, 2, 1] (or any other object supporting __getitem__) for more layers, as sketched below.
Dropout is not supported: it is suggested to implement torch.load instead for more complex networks.
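A minimal sketch of how such a multipliers list could expand into a stack of Linear layers. The class body, the residual connection, and the dimension value here are illustrative assumptions, not the exact implementation:

```python
import torch
import torch.nn as nn

class HypernetworkModule(nn.Module):
    """Sketch: a fully connected stack whose layer widths follow multipliers * dim."""
    def __init__(self, dim, multipliers=(1, 2, 1)):
        super().__init__()
        # e.g. dim=768, multipliers=[1, 2, 1] -> Linear(768, 1536), Linear(1536, 768);
        # [1, 2, 4, 2, 1] would expand to four Linear layers instead of two.
        layers = [
            nn.Linear(int(dim * m_in), int(dim * m_out))
            for m_in, m_out in zip(multipliers[:-1], multipliers[1:])
        ]
        self.linear = nn.Sequential(*layers)

    def forward(self, x):
        # Residual connection assumed: the module's output is added to its input.
        return x + self.linear(x)

# One pair of modules per attention dimension, as in the tuple shown above:
size = 768  # illustrative value
modules = (HypernetworkModule(size, [1, 2, 1]), HypernetworkModule(size, [1, 2, 1]))
```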
LR scheduling with CosineAnnealingWarmRestarts and ExponentialLR
ExponentialLR(optimizer, gamma = 0.01 ** (1 / steps))
With steps = 100k, the learning rate is reduced to 0.01x its initial value after 100k scheduler steps.
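A quick check of the decay factor (values are illustrative):

```python
steps = 100_000
gamma = 0.01 ** (1 / steps)
# After `steps` scheduler steps the LR has been multiplied by gamma ** steps,
# which is exactly 0.01 by construction.
print(gamma ** steps)  # ~0.01
```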
CosineAnnealingWarmRestarts periodically increases and decreases the learning rate, which is sometimes useful to escape local minima.
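A minimal sketch of attaching either scheduler to an optimizer. The optimizer choice, learning rate, and restart period are assumptions for illustration, not the values used in this change:

```python
import torch
from torch.optim.lr_scheduler import ExponentialLR, CosineAnnealingWarmRestarts

params = [torch.nn.Parameter(torch.zeros(768))]  # stands in for the hypernetwork weights
optimizer = torch.optim.AdamW(params, lr=5e-3)

steps = 100_000
# Option 1: smooth exponential decay down to 0.01x over `steps` steps.
scheduler = ExponentialLR(optimizer, gamma=0.01 ** (1 / steps))

# Option 2: cosine decay with periodic restarts; the LR drops toward eta_min and
# jumps back up every T_0 steps (period growing by T_mult after each restart).
# scheduler = CosineAnnealingWarmRestarts(optimizer, T_0=1000, T_mult=2, eta_min=1e-5)

for step in range(steps):
    # ... forward pass, loss.backward() ...
    optimizer.step()
    scheduler.step()
```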
It would probably be better to use torch.load to support complex hypernetworks and dropout...