bkpgr35

Results 9 comments of bkpgr35

My fix would look like this; it is quite straightforward. However, I am strongly concerned that this change will break the original logic of the source code. ``` if activation.__name__ != 'linear':...

@Sushant-aggarwal Hi, I ran into this issue and solved it. In my case the model had a dynamic input size, so make sure your model's input size is fixed. Hope...

I uninstalled tensorflow-gpu and reinstalled tensorflow, so it now runs without the GPU, and everything works properly. I think some GPU settings may be incorrect.......

@divamgupta Hi, sorry for the late reply. I am using TensorFlow 1.11.0, and the GPU memory is free.

The upsample layer can be created, but every time I try to execute it, it crashes. I don't know what I did wrong. Has anyone run into a similar...

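Hello, here is the screenshot. I am training on two GPUs, and there is a problem with how split distributes the data; it is probably caused by the data in the last step being distributed unevenly. ![微信图片_20210909162149](https://user-images.githubusercontent.com/35348196/132650637-4bf670d3-04b6-4d6e-9b1b-0e7185e743a2.png)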

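Hello, I have solved this problem. If the last step has fewer than batch_size examples, things can go wrong; for example, if that batch size happens not to be even, the split across two devices fails. The fix is simple: change [this](https://github.com/huanghuidmml/tfbert/blob/master/run_ner.py#L112) to True. That said, I personally think the data distribution logic still has some issues: in the test phase the data should not be distributed across multiple GPUs; instead, one GPU should run the test while the others do nothing. @huanghuidmml What do you think?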
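The last-step failure described in this comment can be illustrated with a small sketch. It is not the tfbert code: `make_batches`, `splits_evenly`, and the `drop_last` flag are hypothetical names, the numbers are made up, and it assumes the linked setting controls whether the final incomplete batch is dropped, which the comment does not state.

```python
def make_batches(num_examples, batch_size, drop_last):
    """Return the sizes of consecutive batches drawn from num_examples items."""
    full, remainder = divmod(num_examples, batch_size)
    sizes = [batch_size] * full
    if remainder and not drop_last:
        sizes.append(remainder)  # final batch, smaller than batch_size
    return sizes


def splits_evenly(batch_sizes, num_devices):
    """True if every batch can be cut into equal shards, one per device."""
    return all(size % num_devices == 0 for size in batch_sizes)


# Hypothetical numbers: 101 examples, batch size 32, two GPUs.
kept = make_batches(101, 32, drop_last=False)    # last batch has 5 examples
dropped = make_batches(101, 32, drop_last=True)  # incomplete batch discarded

print(kept, splits_evenly(kept, 2))
print(dropped, splits_evenly(dropped, 2))
```

With the incomplete batch kept, the final 5 examples cannot be divided equally between two devices; dropping it leaves only batches of 32, which split cleanly. The cost is that a few examples are skipped, which is one reason single-GPU evaluation may be preferable for the test phase.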

@InstantWindy Hi, I think you misunderstood the paper. Fig. 2 in the paper doesn't mean the shrunken image is fed in directly; look at the small text in the figure: "share weights...

@MathiasGilson I guess the reason is that he wants to rescale the range from 0-255 to -0.5 to +0.5. The intuition behind this is similar to batch normalization, which helps training converge...
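As a rough illustration of that rescaling, here is a minimal sketch. The exact formula used in the repository under discussion is not shown in this comment, so `value / 255.0 - 0.5` is an assumption (one common way to map 8-bit pixels onto a zero-centered range), and the function names are hypothetical.

```python
def rescale_pixel(value):
    """Map an 8-bit intensity in [0, 255] to the range [-0.5, 0.5]."""
    return value / 255.0 - 0.5


def rescale_image(rows):
    """Apply rescale_pixel to every value in a 2-D list of intensities."""
    return [[rescale_pixel(v) for v in row] for row in rows]


# A hypothetical one-row image: darkest, mid-gray, and brightest pixels.
image = [[0, 128, 255]]
scaled = rescale_image(image)
print(scaled)
```

The darkest pixel lands at -0.5, the brightest at +0.5, and mid-gray sits close to zero, so the inputs are roughly zero-centered before they reach the network.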