Zach Nussbaum
Where are the negative samples even used for BERT? I don't see them referenced anywhere in the code other than being initialized.
How is the sampling so much different? From the paper: "For easy and fair evaluation, we follow the common strategy in [12, 22, 49], pairing each ground truth item in the test set..."
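For context, here's a sketch of how I understand that evaluation strategy (the function and parameter names, including `num_negatives`, are my own, not from the paper): each ground-truth item is ranked against a set of randomly sampled items the user never interacted with, and hit rate is computed over that small candidate set rather than the full catalog.

```python
import random

def sampled_hit_rate(ground_truth, all_items, score, num_negatives=100, k=10, seed=0):
    """Sketch of sampled-negative evaluation: rank each user's true item
    against num_negatives randomly sampled other items, and count a hit
    when the true item lands in the top-k of that candidate set."""
    rng = random.Random(seed)
    hits = 0
    for user, true_item in ground_truth.items():
        negatives = rng.sample([i for i in all_items if i != true_item], num_negatives)
        candidates = negatives + [true_item]
        ranked = sorted(candidates, key=lambda item: score(user, item), reverse=True)
        hits += true_item in ranked[:k]
    return hits / len(ground_truth)
```

With a scorer that always ranks the true item first, this returns 1.0; a random scorer would land near k / (num_negatives + 1).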
`conda create -n pluribus`, `conda activate pluribus`, `pip install poker_ai`, then `poker_ai`. I don't have a ton of experience with conda, but I'm still getting the same error after running the above.
Hm, I'll test it out. Either way, I'm excited to try it!
Thanks Edward for the suggestions! Quick clarification: can you expand on what you mean by `reason if it's expected`? Are there situations where that wouldn't be the case? And yes...
@shjwudp I interpreted that chart as one showing the benefits of muP: at increasing depth, the HPs do actually provide better performance, whereas SP has a shifting...
@edwardjhu thanks for this explanation. I think I missed this part in the paper, but it makes intuitive sense to me. Is there a section in the paper that describes how `mup`...
@edwardjhu I did not realize the keys of the models dict were their widths, so the plots look a little different when I make that change: `mup` `sp` ...
Ah, thanks so much! I missed that in my first few passes.
@edwardjhu this is a somewhat silly question, but I wanted to double-check. When we are transferring parameters, should we retain the `mup` Readout layers or revert to the SP layers...
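For my own understanding, here's a minimal pure-Python sketch of the readout behavior I'm asking about (the names and the `base_width` multiplier are my reading of what `mup.MuReadout` does, not the actual implementation): muP scales the readout output by `base_width / fan_in`, so in the fully-aligned worst case the logits stay bounded as width grows, which is why I'd expect the `mup` readout to matter at transfer time.

```python
def readout(x, w, base_width=None):
    """Dot-product readout for a single logit. With base_width set, apply a
    muP-style base_width / fan_in output multiplier (my own sketch of the
    idea -- not the real mup.MuReadout code)."""
    out = sum(xi * wi for xi, wi in zip(x, w))
    if base_width is not None:
        out *= base_width / len(x)
    return out

# Fully-aligned worst case: activations and weight entries both O(1).
for width in (64, 256, 1024):
    x = [1.0] * width
    w = [1.0] * width
    # SP logit grows linearly with width; the muP-scaled logit stays at 64.
    print(width, readout(x, w), readout(x, w, base_width=64))
```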