stable-diffusion-webui

Add missing support for linear activation in hypernetwork

Open benkyoujouzu opened this issue 3 years ago • 6 comments

In hypernetworks, the linear activation_func that the old implementation used is missing from the UI.

This adds the linear activation for hypernetworks back to the UI.

benkyoujouzu avatar Oct 26 '22 13:10 benkyoujouzu

Sorry for the inconvenience, I totally forgot about it.

aria1th avatar Oct 26 '22 13:10 aria1th

Just a heads-up: unless I am mistaken, from my testing of this branch, it seems the changes that remove activation_func=None (38, 52) will cause prior trained hypernets to be unable to load.

RuntimeError: hypernetwork uses an unsupported activation function: None

WebDev9000 avatar Oct 28 '22 00:10 WebDev9000

Sorry about that... Before the last update I had only tested new hypernetworks created with the 'linear' activation. Old hypernetworks should now work as well.

benkyoujouzu avatar Oct 28 '22 03:10 benkyoujouzu

Old hypernetworks still do not work

raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for HypernetworkModule:
        Missing key(s) in state_dict: "linear.2.weight", "linear.2.bias".
        Unexpected key(s) in state_dict: "linear.1.weight", "linear.1.bias".

Also, a hypernetwork created by this commit does not work on the latest master commit: 737eb28faca8be2bb996ee0930ec77d1f7ebd939

   raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for HypernetworkModule:
        Missing key(s) in state_dict: "linear.1.weight", "linear.1.bias", "linear.3.weight", "linear.3.bias".
        Unexpected key(s) in state_dict: "linear.4.weight", "linear.4.bias", "linear.6.weight", "linear.6.bias".
        size mismatch for linear.2.weight: copying a param with shape torch.Size([3072, 1536]) from checkpoint, the shape in current model is torch.Size([1536, 3072]).
        size mismatch for linear.2.bias: copying a param with shape torch.Size([3072]) from checkpoint, the shape in current model is torch.Size([1536])

The keys of the state_dict should look like this: linear.0.weight, linear.0.bias, linear.2.weight, linear.2.bias, linear.4.weight, linear.4.bias, linear.5.weight, linear.5.bias

But in this commit the keys of the state_dict are: linear.0.weight, linear.0.bias, linear.2.weight, linear.2.bias, linear.4.weight, linear.4.bias, linear.6.weight, linear.6.bias
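
For anyone who wants to compare two key lists like the ones above, here is a minimal plain-Python sketch. The helper names `keys_for` and `diff_state_dict_keys` are made up for illustration and are not part of the webui code; the sketch only mirrors how a strict `load_state_dict` reports "Missing" keys (the model has them, the checkpoint does not) and "Unexpected" keys (the reverse).

```python
def keys_for(indices, prefix="linear"):
    """Build weight/bias key names for the given layer indices (hypothetical helper)."""
    return [f"{prefix}.{i}.{p}" for i in indices for p in ("weight", "bias")]


def diff_state_dict_keys(model_keys, checkpoint_keys):
    """Return (missing, unexpected) the way a strict load would classify them."""
    missing = sorted(set(model_keys) - set(checkpoint_keys))
    unexpected = sorted(set(checkpoint_keys) - set(model_keys))
    return missing, unexpected


# Key layouts taken from the two lists quoted above.
expected = keys_for([0, 2, 4, 5])  # what the state_dict should look like
found = keys_for([0, 2, 4, 6])     # what this commit produces

missing, unexpected = diff_state_dict_keys(expected, found)
print(missing)     # ['linear.5.bias', 'linear.5.weight']
print(unexpected)  # ['linear.6.bias', 'linear.6.weight']
```

With a real checkpoint, the two lists would come from the `.keys()` of the loaded state_dict and of the model's own `state_dict()`.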

nekoyama32767 avatar Oct 28 '22 04:10 nekoyama32767

Now the default behaviour should be the same as in commit 737eb28faca8be2bb996ee0930ec77d1f7ebd939. Please check whether this works with your old hypernetworks.

benkyoujouzu avatar Oct 28 '22 05:10 benkyoujouzu

Now it works with my old hypernetworks, but it does not work with hypernetworks created by 3302fbd. That's all right, though; I will retrain my hypernetwork.

nekoyama32767 avatar Oct 28 '22 05:10 nekoyama32767

Okay, I checked the code. Apparently "linear": torch.nn.Identity adds an additional layer instead of leaving the slot empty, which shifts the layer indices and breaks old HNs. If a hypernetwork was created with that mapping, that might be a problem... so I'll try to find a way to make both variants work.
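
To illustrate the index shift, here is a rough plain-Python model (not the actual webui code). It assumes the layers live in an nn.Sequential stored as `linear`, where every child takes one positional index but only Linear layers contribute weight/bias keys. The function names and the rule that no activation follows the last Linear layer are assumptions made for this sketch.

```python
def sequential_keys(layers, prefix="linear"):
    """Model the state_dict keys of a Sequential stored under `prefix`.

    Every entry in `layers` occupies one index; only "Linear" entries
    carry weight and bias parameters.
    """
    keys = []
    for index, kind in enumerate(layers):
        if kind == "Linear":
            keys += [f"{prefix}.{index}.weight", f"{prefix}.{index}.bias"]
    return keys


def build_layers(n_linear, activation, skip_linear=True):
    """List the layer kinds for `n_linear` Linear layers with activations in between."""
    layers = []
    for i in range(n_linear):
        layers.append("Linear")
        if i == n_linear - 1:
            continue  # assumption: no activation after the last Linear layer
        is_linear = activation in ("linear", None)
        if is_linear and skip_linear:
            continue  # leave the slot empty, as the old implementation did
        layers.append("Identity" if is_linear else activation)
    return layers


old_style = sequential_keys(build_layers(2, "linear", skip_linear=True))
forced_identity = sequential_keys(build_layers(2, "linear", skip_linear=False))
print(old_style)        # ['linear.0.weight', 'linear.0.bias', 'linear.1.weight', 'linear.1.bias']
print(forced_identity)  # ['linear.0.weight', 'linear.0.bias', 'linear.2.weight', 'linear.2.bias']
```

Under these assumptions the second Linear layer moves from index 1 to index 2 once an Identity is inserted, which matches the linear.1 / linear.2 mismatch in the error reported above.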

aria1th avatar Oct 28 '22 23:10 aria1th

https://github.com/AUTOMATIC1111/stable-diffusion-webui/pull/3771 fixes it, in commit https://github.com/AUTOMATIC1111/stable-diffusion-webui/pull/3771/commits/f361e804ebaa5af4a10711ece2522869fb64a4c6. @nekoyama32767 @benkyoujouzu if a KeyError is happening, it's because the HNs were created without skipping linear, that is, somehow without this line: if activation_func == "linear" or activation_func is None: pass. That is not normal in the main branch, so forcing Identity won't be supported.

aria1th avatar Oct 28 '22 23:10 aria1th