Jose comments

Results: 37 comments of Jose

Why there isn't an option for BatchNormalization?

My fork now supports Batch Normalization in any SAEHD architecture. When executed, the program should ask 'Use BN' at some point. Just answer 'y' and the...

Why there isn't an option for BatchNormalization?

Apparently, loading previous models without BN did not cause any problems, contrary to what I expected. So the PR is good to go.

Why there isn't an option for BatchNormalization?

I got the exact same error. Use the latest commit; it should be fixed there.

Why there isn't an option for BatchNormalization?

That is new. A MemoryError is typically raised when loading large files. Try creating, saving and loading new models to see if the error persists.

Why there isn't an option for BatchNormalization?

Okay, so apparently there was a problem because the current DFL implementation treats trainable and non-trainable weights the same. That caused `running_mean` and `running_var` to be trained...
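A minimal sketch of the distinction, not the DFL code: it assumes each weight carries a hypothetical `trainable` flag and that the running statistics are updated by an exponential moving average with a hypothetical `momentum` of 0.9 (frameworks differ on the momentum convention). Only the trainable weights would go to the optimizer.

```python
def ema_update(running, batch_stat, momentum=0.9):
    """Blend a per-channel batch statistic into its running buffer (no gradients)."""
    return [momentum * r + (1.0 - momentum) * b for r, b in zip(running, batch_stat)]


def split_weights(weights):
    """Split weights into trainable ones and non-trainable buffers.

    Each weight is a dict with a hypothetical "trainable" flag.
    """
    trainable = [w["name"] for w in weights if w["trainable"]]
    buffers = [w["name"] for w in weights if not w["trainable"]]
    return trainable, buffers


weights = [
    {"name": "gamma", "trainable": True},
    {"name": "beta", "trainable": True},
    {"name": "running_mean", "trainable": False},
    {"name": "running_var", "trainable": False},
]

# Only `trainable` would be handed to the optimizer;
# `buffers` are refreshed with ema_update during the forward pass.
trainable, buffers = split_weights(weights)
new_running_mean = ema_update([0.0, 0.0], [3.0, -1.0])
```

If the buffers end up in the optimizer's list, gradient steps overwrite the moving averages, which matches the symptom described in the comment.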

Why there isn't an option for BatchNormalization?

VRAM usage should increase, but not by much. The mean and variance only have a channel dimension. Maybe the gradient with respect to the covariate shift is at fault here. Although again,...
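As a rough illustration of why the extra state is small, here is a sketch that counts values only. The layer sizes are made up for the example, and activation memory is not modelled.

```python
def batchnorm_state_size(channels):
    """Values stored by one BN layer: gamma, beta, running mean, running variance."""
    return 4 * channels


def conv_weight_size(in_channels, out_channels, kernel_size):
    """Values stored by one 2D convolution: kernel weights plus one bias per output channel."""
    return in_channels * out_channels * kernel_size * kernel_size + out_channels


# Hypothetical layer: 128 -> 256 channels with a 3x3 kernel, followed by BN.
bn_values = batchnorm_state_size(256)
conv_values = conv_weight_size(128, 256, 3)
ratio = bn_values / conv_values
```

For these assumed sizes the BN state is well under one percent of the convolution weights.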

Freezing layers

What you describe would not work: if a layer is unfrozen, all the layers after it should be unfrozen too. The reason is that neural network layers build on top...
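The rule can be sketched as follows, assuming layers are listed in forward order. The layer names are hypothetical and the helper is not part of any library.

```python
def trainable_flags(layer_names, first_unfrozen):
    """Return {layer: is_trainable}, unfreezing `first_unfrozen` and every layer after it."""
    start = layer_names.index(first_unfrozen)  # raises ValueError if the name is unknown
    return {name: i >= start for i, name in enumerate(layer_names)}


# Hypothetical layer order, from input to output.
layers = ["encoder_1", "encoder_2", "inter", "decoder"]
flags = trainable_flags(layers, "inter")
```

Here `encoder_1` and `encoder_2` stay frozen, while `inter` and `decoder` are trained together.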

Definitions clarification

Lines 816-817 of `graph.py` suggest that `diff` in fact means "different", as in O != A.

prove of the simple problem of Fig. 1

If you add the `eq2_triangle` definition to `defs.txt` without a trailing newline (\n) at the end, you get this: ``` Traceback (most recent call last): File "/home/user/anaconda3/envs/alphageo/lib/python3.10/runpy.py", line 196,...

Suggestion: Publish original text of jgex_ag_231.txt

Yes, I understand it is not a trivial task. But having that set of examples and fine-tuning GPT-4 with chain of thought and some very detailed system instructions could work. The...