
Add Exceptions to LanguageModel and question on DenseBlock Impl

Open devindkim opened this issue 11 months ago • 2 comments

This adds some exception handling to LanguageModel and a comment about the implementation of DenseBlock. Usually the widening -> GELU -> projection steps are applied sequentially, but in this implementation they aren't. I'm curious whether this is an intentional detail?
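To make the question concrete, here is a minimal sketch of the two layouts being compared. It assumes the non-sequential structure is a gated feed-forward block (GeGLU-style), where the GELU branch multiplies a second linear branch before the output projection. The function names and weight arguments (`sequential_ffn`, `gated_ffn`, `w_gate`, `w_value`, `w_out`) are hypothetical and chosen for illustration; this is not the repository's code.

```python
import math


def gelu(x):
    # tanh approximation of GELU
    return 0.5 * x * (1.0 + math.tanh(math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)))


def matvec(w, x):
    # w is a list of rows (out_dim x in_dim), x is a list of length in_dim
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


def sequential_ffn(x, w_in, w_out):
    # "Usual" layout: widen -> GELU -> project, one step after another
    h = [gelu(v) for v in matvec(w_in, x)]
    return matvec(w_out, h)


def gated_ffn(x, w_gate, w_value, w_out):
    # Assumed gated layout: two widening branches read the same input,
    # the GELU branch gates the linear branch elementwise, then project
    gate = [gelu(v) for v in matvec(w_gate, x)]
    value = matvec(w_value, x)
    h = [g * v for g, v in zip(gate, value)]
    return matvec(w_out, h)
```

In this sketch, the gated block reduces to the sequential one whenever the linear branch outputs all ones, so the gated form can be read as a generalization of the sequential form rather than a reordering of it.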

devindkim avatar Mar 18 '24 22:03 devindkim

I am kinda curious... Does this degrade model performance?

snapsl avatar Mar 22 '24 16:03 snapsl

how the turn tables

devindkim avatar Apr 06 '24 01:04 devindkim