Jared Casper
Just to clarify, I believe the only difference between the fused layer norm in Megatron's code vs APEX is in the types, but that could lead to pretty different performance...
I don't have the `B5cxx11` in the symbol in my binary. Is the symbol the same without the CXX11_ABI define? For me that symbol is defined in the `_pywrap_tensorflow.so` library:...
Are you accounting for the fact that the "reserved" symbol defaults to `0` for the `warpctc_tensorflow.ctc` op and the highest symbol value for the `tf.nn.ctc_loss` op? To switch between the...
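To make the blank-symbol difference concrete, here is a minimal sketch of converting label sequences and per-frame scores between the two conventions. The helper names are hypothetical and belong to neither library. It assumes the non-blank labels keep their relative order and simply shift by one, which should be checked against your own label map.

```python
# Hypothetical helpers, not part of warp-ctc or TensorFlow.
# Convention A (blank first): blank = 0, real labels are 1 .. num_classes - 1.
# Convention B (blank last):  blank = num_classes - 1, real labels are 0 .. num_classes - 2.


def blank_last_to_blank_first_labels(labels, num_classes):
    """Shift real labels up by one so that index 0 is free for the blank."""
    if any(label < 0 or label >= num_classes - 1 for label in labels):
        raise ValueError("labels must lie in [0, num_classes - 2] when blank is last")
    return [label + 1 for label in labels]


def blank_first_to_blank_last_labels(labels, num_classes):
    """Shift real labels down by one so that the highest index is the blank."""
    if any(label < 1 or label > num_classes - 1 for label in labels):
        raise ValueError("labels must lie in [1, num_classes - 1] when blank is 0")
    return [label - 1 for label in labels]


def move_blank_score_first(frame_scores):
    """Reorder one frame of per-class scores from blank-last to blank-first."""
    return [frame_scores[-1]] + list(frame_scores[:-1])
```

If the labels are shifted but the score columns are not (or the other way round), the two ops are scoring different alignments, so their losses are not expected to match.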
gcc5 is not supported in CUDA 7.5. Will you try again with CUDA 8.0? It appears CUDA 8.0 supports Ubuntu 16.04 with gcc 5.3.1.
@starimpact can you check that you are using the latest CUDA 8.0.44 and that `-gencode arch=compute_61,code=sm_61` is part of the nvcc command line when you build? (maybe try starting with...
The API on the master branch changed, and mxnet looks to be using the old API. To maintain compatibility, you can use the v1 tag/branch, which will maintain the old...
Can you give more details such as which OS you are on (i.e. Ubuntu version, etc.) and which CUDA version you are using? Have you modified the source in any...
Can you post the nvcc command line that it is using to try to compile ctc_entrypoint.cu? (You can get it to show you using `make VERBOSE=1`.) Just wondering if cmake is...
I suspect the problem comes from using gcc 6.1, which is not supported by CUDA. For CUDA 7.5 on CentOS 7.x, only gcc 4.8.2 is supported. I don't think gcc...
Looks okay to me. My only suggestion would be to add the flag only if g++ version 5+ is being used. However, I no longer have write access to this repo...