Rui Wang

Results 37 comments of Rui Wang

Hi, this code is specialized for inference only. As you can see, we have tried really hard to make it run with a moderate GPU memory (GRAM) requirement, at least for inference. During...

Hi @ptrendx, we used both `pip install git+https://github.com/NVIDIA/TransformerEngine.git@stable` and `pip install git+https://github.com/NVIDIA/TransformerEngine.git@main` and tried Python versions from 3.9 to 3.11. Every time, we simply installed `pytorch==2.0.1` and `packaging` and then ran...

Hi @ptrendx, after a little digging, we think we have located the problem but are not sure what the solution is: ``` /usr/bin/c++ -Dtransformer_engine_EXPORTS -I/home/rui/TransformerEngine/transformer_engine -I/home/rui/TransformerEngine/transformer_engine/common/include -I/usr/local/cuda-11.8/targets/x86_64-linux/include -I/home/rui/TransformerEngine/transformer_engine/../3rdparty/cudnn-frontend/include -I/tmp/tmp9cj2vyni/common/string_headers -isystem /usr/local/cuda-11.8/include...

Hi, some updates: our H800 machines can now install successfully, but the A100 machines cannot yet. The H800 machines just needed cuDNN, but the A100 machines, even after installing cuDNN, still...

Hi, yes it is in `/usr/local/cuda-11.8/include` and it seems that `__half2ushort_rz` is declared there.

Hi @MicPie, we have been able to install this with newer commits now. Were you trying the stable releases?

Hi, you would have to modify `setup.py` to make it output the actual error message (perhaps by running the commands manually in a terminal) so that we can know exactly what is...