SARTHAK JAIN
Results: 2 issues of SARTHAK JAIN
Thanks for the great blog post. It was quite helpful for keeping track of what's happening in the paper. Currently in the section "Discrete-time SSM: The Recurrent Representation" in the...
Hi, I was wondering if there is a way to put the dropout and layer-norm layers in BERT into eval mode during training when we set the requires_grad parameter to...
Contributions welcome