Pruthvi Madugundu
This updates PR [#73040](https://github.com/pytorch/pytorch/pull/73040). With these changes, PyTorch compiles successfully with ROCm when `NDEBUG` is enabled. Solution: For HIP we keep `__device__ __assert_fail()`...
cc @jeffdaily @sunway513 @jithunnair-amd @ROCmSupport
- Removes hard-coding and helps with internal builds.

cc @jeffdaily @sunway513 @jithunnair-amd @ROCmSupport @dllehr-amd @jataylo @hongxiayang
- Build is currently enabled only for hip_basic (cuda_basic).
- Sample build command for reference: `cmake ../ -DCMAKE_C_FLAGS="-Werror -Wno-deprecated-declarations -D__HIP_PLATFORM_HCC__=1" -DCMAKE_CXX_FLAGS="-Werror -Wno-deprecated-declarations -D__HIP_PLATFORM_HCC__=1" -DTP_ENABLE_SHM=OFF -DTP_ENABLE_CMA=OFF -DTP_USE_ROCM=ON -DTP_ENABLE_HIP_XTH=OFF -DTP_ENABLE_HIP_IPC=OFF -DTP_ENABLE_HIP_GDR=OFF -DTP_ENABLE_IBV=OFF -DTP_BUILD_TESTING=ON`
- Add Hipify as a git submodule.
- Trigger hipify from the CMake build.
- `TP_USE_ROCM` controls the trigger; it is set to ON when building on ROCm.
- Changes to control hipify of `CUDA_VERSION` to `HIP_VERSION`.
- Use `GLOO_USE_ROCM` instead of `__HIP_PLATFORM_HCC__`.
- Add `__HIP_PLATFORM_AMD__`, since `__HIP_PLATFORM_HCC__` is being deprecated.
### 🐛 Describe the bug

Test file | Test | Class
-- | -- | --
distributed/rpc/cuda/test_tensorpipe_agent | test_async_execution_nested_with_cuda_future | (__main__.TensorPipeTensorPipeAgentCudaRpcTest)
distributed/rpc/cuda/test_tensorpipe_agent | test_async_execution_with_cuda_future | (__main__.TensorPipeTensorPipeAgentCudaRpcTest)
distributed/rpc/cuda/test_tensorpipe_agent | test_basic_gloo_ckpt_always | (__main__.TensorPipePipeWithDDPTest)
distributed/rpc/cuda/test_tensorpipe_agent | test_basic_gloo_ckpt_except_last | (__main__.TensorPipePipeWithDDPTest)
distributed/rpc/cuda/test_tensorpipe_agent | test_basic_gloo_ckpt_never | (__main__.TensorPipePipeWithDDPTest)
...
### 🐛 Describe the bug

Test | Class
-- | --
test_batchnorm_cudnn_nhwc | (__main__.TestNN)
test_conv_backend_cudnn1d_has_bias_False_strided_False_contiguous_False_cuda | (__main__.TestNNDeviceTypeCUDA)
test_conv_backend_cudnn1d_has_bias_False_strided_False_contiguous_True_cuda | (__main__.TestNNDeviceTypeCUDA)
test_conv_backend_cudnn1d_has_bias_False_strided_True_contiguous_False_cuda | (__main__.TestNNDeviceTypeCUDA)
test_conv_backend_cudnn1d_has_bias_False_strided_True_contiguous_True_cuda | (__main__.TestNNDeviceTypeCUDA)
test_conv_backend_cudnn1d_has_bias_True_strided_False_contiguous_False_cuda | (__main__.TestNNDeviceTypeCUDA)
test_conv_backend_cudnn1d_has_bias_True_strided_False_contiguous_True_cuda | (__main__.TestNNDeviceTypeCUDA)
test_conv_backend_cudnn1d_has_bias_True_strided_True_contiguous_False_cuda...