Jurian Baas

2 comments by Jurian Baas

It seems to me that because the learning rate is included, the squared gradient is smaller (a smaller number is squared). Therefore, the updates will not shrink in size as fast...
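To make the point above concrete, here is a minimal sketch. It assumes an AdaGrad-style update, which the comment does not name explicitly. The function `step_sizes`, the constant gradient, the learning rate of 0.05, and the initial accumulator value of 1.0 are all hypothetical choices for illustration, not taken from any particular library. It compares accumulating the raw squared gradient against accumulating the square of the gradient after it has been multiplied by the learning rate.

```python
import math

def step_sizes(grads, lr, scale_before_squaring, init_acc=1.0):
    """Return per-step update magnitudes and the final accumulator.

    Hypothetical helper: assumes an AdaGrad-style rule where each update
    is (lr * g) / sqrt(accumulated squared gradients).
    """
    acc = init_acc  # assumed non-zero start; with 0.0 the lr would cancel out
    sizes = []
    for g in grads:
        scaled = lr * g
        # Variant: square the lr-scaled gradient. Standard: square the raw one.
        acc += (scaled * scaled) if scale_before_squaring else (g * g)
        sizes.append(abs(scaled) / math.sqrt(acc))
    return sizes, acc

grads = [1.0] * 100  # constant gradient, purely for illustration
standard, acc_standard = step_sizes(grads, lr=0.05, scale_before_squaring=False)
variant, acc_variant = step_sizes(grads, lr=0.05, scale_before_squaring=True)

print(f"accumulator: standard={acc_standard:.4f}, variant={acc_variant:.4f}")
print(f"last step:   standard={standard[-1]:.5f}, variant={variant[-1]:.5f}")
```

Under these assumptions, and with a learning rate below 1, the variant's accumulator grows more slowly, so its step sizes decay more slowly. If the accumulator started at exactly zero, the learning rate would cancel out of the variant's update entirely, so the effect depends on that starting value.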

This is a very interesting method and I would like to use it in [my project](https://github.com/Jurian/graph-embeddings). At the moment I am using n-grams to create boolean vectors; perhaps this works...