KevinKune

Results 8 issues of KevinKune

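I think using Xavier initialization directly for the network parameters is slightly problematic, because the network's true input/output shape is not [2*embed_size, 1] but [n^2*embed_size, n^2/2]. From this point of view, to keep the distributions of the outputs and gradients close to standard normal, a 3-dimensional Xavier initialization with shape [n^2/2, 2*embed_size, 1] should be used. Of course, from the network's own perspective the input/output shape is still [2*embed_size, 1], so Glorot's method may not fully apply, and a more reasonable initialization probably lies somewhere between [2*embed_size, 1] and [n^2/2, 2*embed_size, 1]. The current initialization probably produces NaN so easily because its variance is too large. This can be crudely controlled by adjusting the temperature, but if the gradients explode within the first few steps, adjusting the temperature cannot rescue training either.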
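To make the variance argument above concrete, here is a minimal sketch that compares the Glorot (Xavier) normal standard deviation, sqrt(2 / (fan_in + fan_out)), under the two views of the layer. The values of `embed_size` and `n` are hypothetical and chosen only for illustration, `n` is assumed to be the number of fields being paired, and `glorot_std` is a helper written for this example rather than a function from the repository.

```python
import math


def glorot_std(fan_in, fan_out):
    """Std of the Glorot (Xavier) normal initializer: sqrt(2 / (fan_in + fan_out))."""
    return math.sqrt(2.0 / (fan_in + fan_out))


# Hypothetical sizes, for illustration only (not taken from the original code).
embed_size = 64
n = 10  # assumed meaning: number of fields being paired

# View 1: the layer as the network sees it, shape [2*embed_size, 1].
std_layer = glorot_std(2 * embed_size, 1)

# View 2: the "effective" shape argued for in the issue, [n^2*embed_size, n^2/2].
std_effective = glorot_std(n * n * embed_size, n * n // 2)

print(f"layer view std:     {std_layer:.4f}")
print(f"effective view std: {std_effective:.4f}")
print(f"ratio:              {std_layer / std_effective:.2f}")
```

Under these assumed sizes the effective view gives a noticeably smaller standard deviation, which is consistent with the suspicion that the plain [2*embed_size, 1] initialization has too much variance.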

Thank you very much for sharing this code. Could you explain lines 134 and 143, where the dropout_keep parameter is passed directly to tf.nn.dropout during init_graph

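AFM training produces NaN very easily. Have you run into this? Do you have any suggestions for tuning the hyperparameters? Which parameters are the most sensitive?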

Hi, thanks for releasing the paper & code. I have tried the IMDb text classification task and UDA achieved quite promising improvements. Will you release your models and datasets for...

Thanks for sharing the code. Your gradient accumulation implementation helps me a lot on my datasets (roughly >10% F1 improvement with a very large batch size). Please check line 87 of...

Hi, I've heard of this strong model which can learn atomic coordinates. Now I want to adapt this model for my project, but I find the code is a bit...

Dear authors, I have been following your work and using PoseCheck to calculate clash and strain energy since last year. Recently I pulled your update, re-calculated those numbers, and found a...

Hi: Recently I've been using PoseCheck to evaluate clash, strain energy and key interactions for SBDD models. And I found the following unexpected behaviors: 1. When loading CrossDocked test set...