AngelBottomless

Results: 124 comments of AngelBottomless

@enn-nafnlaus Typically dropout [is applied after the activation](https://sebastianraschka.com/faq/docs/dropout-activation.html), but for LayerNorm (or norms in general) [the order does not matter](https://blog.paperspace.com/busting-the-myths-about-batch-normalization/); **in practice it is actually used both ways.** I'd say that both are practical. Actually...
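For concreteness, a minimal PyTorch sketch of the two orderings (the layer width and dropout rate here are placeholders, not values from any specific model):

```python
import torch.nn as nn

# Dropout placed after the activation; LayerNorm can sit on either
# side of the dropout - both orderings appear in practice.
block_a = nn.Sequential(
    nn.Linear(768, 768),
    nn.ReLU(),
    nn.Dropout(p=0.1),   # dropout after activation
    nn.LayerNorm(768),   # norm after dropout
)

block_b = nn.Sequential(
    nn.Linear(768, 768),
    nn.ReLU(),
    nn.LayerNorm(768),   # norm before dropout - also common
    nn.Dropout(p=0.1),
)
```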

The dropout ratio depends heavily on the representation of our data. If we assume a shallowly decomposed latent space, the dropout ratio should be bigger. If we assume very critical -...

Currently it's directory order. Rather than making another dict, sort the result of glob.iglob instead, [here](https://github.com/Omegastick/stable-diffusion-webui/blob/ordered_hypernetworks/modules/hypernetworks/hypernetwork.py#L222). You can simply wrap it with `sorted()`, or `sorted(iterable, key=func)`.
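Roughly like this sketch (the glob pattern is an assumption, not the exact one in that file):

```python
import glob
import os

# Assumed pattern for hypernetwork checkpoints; the point is just to
# wrap the iglob iterator in sorted() so the order is deterministic.
paths = sorted(glob.iglob(os.path.join("models", "hypernetworks", "*.pt")))
```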

You mentioned it should be sorted in step order, but I found that it won't sort correctly if the step numbers have different digit counts. Assume you have `a-4000.pt` and `a-39000.pt`. The sorted...
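A small sketch of the problem and a numeric sort key that fixes it (the filename pattern is assumed from the example above):

```python
import re

files = ["a-39000.pt", "a-4000.pt"]

print(sorted(files))
# ['a-39000.pt', 'a-4000.pt'] - lexicographic: '3' < '4', so 39000 wrongly sorts first

def step_key(name: str) -> int:
    """Extract the trailing step count so files sort numerically."""
    match = re.search(r"-(\d+)\.pt$", name)
    return int(match.group(1)) if match else 0

print(sorted(files, key=step_key))
# ['a-4000.pt', 'a-39000.pt'] - numeric: correct step order
```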

Can you specify the Python version you're using? This syntax is from [Python 3.10](https://peps.python.org/pep-0604/).
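For reference, a minimal illustration of the PEP 604 union syntax versus the pre-3.10 equivalent (the function names are made up):

```python
# PEP 604 union syntax - raises TypeError at definition time before 3.10
def load(path: str | None = None): ...

# Pre-3.10 equivalent using typing
from typing import Optional

def load_legacy(path: Optional[str] = None): ...
```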

@captin411 Currently no, you have to set the values so they sum to 1.0. I'm not getting any proper results with a non-1 sum either, so I'll add auto-normalization very soon. Within 20...
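Something like this hypothetical sketch of the auto-normalization, dividing each value by the total so the result sums to 1.0:

```python
def normalize(weights):
    """Rescale values so they sum to 1.0 (hypothetical helper, not the actual patch)."""
    total = sum(weights)
    return [w / total for w in weights]

print(normalize([0.5, 1.0, 0.5]))  # [0.25, 0.5, 0.25]
```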

![image](https://user-images.githubusercontent.com/35677394/200156311-3d8ede91-631c-484b-b776-5b06dc144696.png) ![image](https://user-images.githubusercontent.com/35677394/200161555-2974cf6d-54f5-43ef-8c22-91aa9a744442.png) Note: HNs are tested at strength 0.45, clip skip 2. I didn't use an NSFW prompt, but I censored it for safety.

Reference images are [here](https://github.com/AUTOMATIC1111/stable-diffusion-webui/discussions/2670#discussioncomment-4061517).

Xavier Normal and Uniform have working cases, but yes, I agree we need an additional scaling factor that can control the magnitude. If you're interested, see [this colab](https://colab.research.google.com/drive/1Zxe53p8ICEQ6YelpTA1YI5rr0wkgyqY5#scrollTo=POj6n9PZzRqC) for analysis....
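As a rough sketch of what such a scaling factor could look like in PyTorch (the gain value and tensor size here are arbitrary):

```python
import torch
import torch.nn as nn

# The gain argument acts as the extra scaling factor: it multiplies the
# standard deviation, std = gain * sqrt(2 / (fan_in + fan_out)).
w = torch.empty(320, 320)
nn.init.xavier_normal_(w, gain=0.5)   # scaled-down Xavier Normal
nn.init.xavier_uniform_(w, gain=0.5)  # scaled-down Xavier Uniform
```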

No, weight initialization (kaiming > xavier > normal) has a bigger standard deviation. Weight initialization means we do not start from zero change; we start slightly 'far' from the origin and try to search for the 'lowest...
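A hedged sketch to make the standard-deviation comparison concrete (the 320x320 size and the std=0.01 "plain normal" baseline are assumptions):

```python
import torch
import torch.nn as nn

# Empirical check of the ordering kaiming > xavier > normal:
print(nn.init.kaiming_normal_(torch.empty(320, 320)).std())    # ~0.079, sqrt(2 / fan_in)
print(nn.init.xavier_normal_(torch.empty(320, 320)).std())     # ~0.056, sqrt(2 / (fan_in + fan_out))
print(nn.init.normal_(torch.empty(320, 320), std=0.01).std())  # ~0.010
```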