Jose

Results 37 comments of Jose

In my fork there is now support for Batch Normalization in any SAEHD architecture. When executed, the program should ask 'Use BN' at some point. Just indicate 'y' and the...

Apparently, loading previous models without BN didn't cause the problems I expected. So the PR is free to go.

I got the exact same error. Use the latest commit; it should be fixed.

That is new. `MemoryError` is typically raised when loading large files. Try creating, saving, and loading new models to see if the error persists.

Okay, so apparently there was a problem because the current DFL implementation treats trainable and non-trainable weights the same. That was causing `running_mean` and `running_var` to be trained...
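
To illustrate the distinction, here is a minimal pure-Python sketch of a batch-norm layer that keeps the two kinds of weights apart. It is not DFL's code: the class `BatchNormSketch`, its method names, and the `momentum` and `eps` defaults are all assumptions made for illustration. The point is only that `gamma` and `beta` would go to the optimizer, while `running_mean` and `running_var` are updated by a moving average during the forward pass and never by gradients.

```python
class BatchNormSketch:
    """Illustrative batch norm over a batch of per-channel values (hypothetical, not DFL's API)."""

    def __init__(self, channels, momentum=0.9, eps=1e-5):
        # Trainable: updated by the optimizer through gradients.
        self.gamma = [1.0] * channels
        self.beta = [0.0] * channels
        # Non-trainable: updated only by the moving average below.
        self.running_mean = [0.0] * channels
        self.running_var = [1.0] * channels
        self.momentum = momentum
        self.eps = eps

    def trainable_weights(self):
        return [self.gamma, self.beta]

    def non_trainable_weights(self):
        return [self.running_mean, self.running_var]

    def forward(self, batch, training):
        """batch is a list of samples; each sample is a list with one value per channel."""
        n = len(batch)
        channels = len(self.gamma)
        out = [[0.0] * channels for _ in batch]
        for ch in range(channels):
            column = [row[ch] for row in batch]
            if training:
                mean = sum(column) / n
                var = sum((v - mean) ** 2 for v in column) / n
                m = self.momentum
                # Moving-average update, done in place so earlier references stay valid.
                self.running_mean[ch] = m * self.running_mean[ch] + (1 - m) * mean
                self.running_var[ch] = m * self.running_var[ch] + (1 - m) * var
            else:
                mean, var = self.running_mean[ch], self.running_var[ch]
            scale = self.gamma[ch] / (var + self.eps) ** 0.5
            for i, v in enumerate(column):
                out[i][ch] = scale * (v - mean) + self.beta[ch]
        return out
```

An optimizer built on this sketch would iterate over `trainable_weights()` only. Saving and loading a model would need both lists.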

VRAM should increase, but not much. The mean and variance only have a channel dimension. Maybe the gradient with respect to the covariance shift is at fault here. Although again,...
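
As a rough back-of-the-envelope check of the "not much" claim, the sketch below counts the extra stored values batch norm adds per layer and compares them with a convolution kernel. The helper names and the example sizes (3x3 kernel, 256 channels) are assumptions for illustration, and the count covers stored weights only, not activations or gradients kept for backpropagation.

```python
def bn_extra_floats(channels, with_scale_shift=True):
    """Values stored by batch norm: running mean and variance, plus optional gamma and beta."""
    per_channel_vectors = 2 + (2 if with_scale_shift else 0)
    return per_channel_vectors * channels


def conv_weight_floats(kernel_size, in_channels, out_channels):
    """Values stored by a square 2D convolution kernel, bias ignored."""
    return kernel_size * kernel_size * in_channels * out_channels


bn_count = bn_extra_floats(256)               # 4 vectors of length 256
conv_count = conv_weight_floats(3, 256, 256)  # 3x3 kernel, 256 -> 256 channels
print(bn_count, conv_count, conv_count // bn_count)
```

Under these assumptions the batch-norm state is several hundred times smaller than the kernel it follows, so a large VRAM jump would more likely come from activations or gradients than from the statistics themselves.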

What you mention would not work: if a layer is unfrozen, all the layers after it should be unfrozen too. The reason is that neural network layers build on top...
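
The rule can be sketched as a pair of small helpers. Both function names are hypothetical and the layers are represented only by their position in the network, so this is an illustration of the constraint rather than any framework's freezing API.

```python
def trainable_flags(num_layers, first_unfrozen):
    """Freeze layers [0, first_unfrozen) and unfreeze every layer from first_unfrozen on."""
    if not 0 <= first_unfrozen <= num_layers:
        raise ValueError("first_unfrozen must be between 0 and num_layers")
    return [index >= first_unfrozen for index in range(num_layers)]


def is_valid_freeze(flags):
    """True if no frozen layer (False) appears after an unfrozen one (True)."""
    seen_unfrozen = False
    for trainable in flags:
        if trainable:
            seen_unfrozen = True
        elif seen_unfrozen:
            return False
    return True
```

For example, `trainable_flags(5, 3)` freezes the first three layers and trains the last two, while a pattern such as `[False, True, False]` is rejected by `is_valid_freeze` because a frozen layer sits on top of one whose output keeps changing.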

`graph.py`, lines 816-817, suggest that `diff` in fact means "different", as in O != A.

If you add the `eq2_triangle` definition to `defs.txt` without adding an extra line break (\n) at the end, you get this: ``` Traceback (most recent call last): File "/home/user/anaconda3/envs/alphageo/lib/python3.10/runpy.py", line 196,...

Yes, I understand it is not a trivial task. But having that set of examples and fine-tuning GPT-4 with chain of thought and some very detailed system instructions could work. The...