
Testing a C++ case with MPI failed.

Open • alamj opened this issue 1 year ago • 1 comment

🐛 Describe the bug

I am testing the following example:

https://github.com/pytorch/examples/blob/main/cpp/distributed/dist-mnist.cpp

I get the following error:

[ 50%] Building CXX object CMakeFiles/awcm.dir/xdist.cxx.o
/home/alamj/TestCases/tests/xtorch/xdist/xdist.cxx:1:10: fatal error: c10d/ProcessGroupMPI.hpp: No such file or directory
    1 | #include <c10d/ProcessGroupMPI.hpp>

I changed the first line to use the full path, to make sure the .hpp file is found:

#include </project/def-alamj/shared/libtorch/include/torch/csrc/distributed/c10d/ProcessGroupMPI.hpp>

The new error suggests there is something else I need to know about the code given in the tutorial:

[ 50%] Building CXX object CMakeFiles/awcm.dir/xdist.cxx.o
/home/alamj/TestCases/tests/xtorch/xdist/xdist.cxx:38:21: error: ‘c10d’ was not declared in this scope; did you mean ‘c10’?
   38 |     std::shared_ptr<c10d::ProcessGroupMPI> pg,
      |                     ^~~~
      |                     c10
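A small compile-time probe might help tell the two failures apart (header not found versus `c10d` not declared). This is only a diagnostic sketch, not a confirmed fix. It rests on an assumption that this report does not verify: that the contents of `ProcessGroupMPI.hpp` are wrapped in a `USE_C10D_MPI` preprocessor guard, so the `c10d` names stay invisible unless that macro is defined for the build. The helper names (`XDIST_HEADER_FOUND`, `XDIST_MPI_MACRO`, `diagnose`, `diagnose_this_build`) are hypothetical. The block needs C++17 for `__has_include`.

```cpp
#include <string>

// Probe the header using the path relative to libtorch's include directory
// (assumption: <libtorch>/include is passed to the compiler with -I).
#if defined(__has_include)
#  if __has_include(<torch/csrc/distributed/c10d/ProcessGroupMPI.hpp>)
#    define XDIST_HEADER_FOUND 1
#  endif
#endif
#ifndef XDIST_HEADER_FOUND
#  define XDIST_HEADER_FOUND 0
#endif

// Assumption: the header body is guarded by USE_C10D_MPI, so record
// whether that macro is defined in this translation unit.
#ifdef USE_C10D_MPI
#  define XDIST_MPI_MACRO 1
#else
#  define XDIST_MPI_MACRO 0
#endif

// Hypothetical helper: turn the two probe results into a readable message.
inline std::string diagnose(bool header_found, bool mpi_macro) {
  if (!header_found) {
    return "header not on include path";
  }
  if (!mpi_macro) {
    return "header found, but USE_C10D_MPI is not defined";
  }
  return "header found and USE_C10D_MPI is defined";
}

// Hypothetical helper: report the state of the current build.
inline std::string diagnose_this_build() {
  return diagnose(XDIST_HEADER_FOUND != 0, XDIST_MPI_MACRO != 0);
}
```

Compiling this with the same flags as `xdist.cxx` and printing `diagnose_this_build()` from `main` would indicate which case applies. If the header is found but the macro is not defined, the libtorch build in use may not have been built with MPI support, which would match the second error.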

Please let me know how I can work around or fix this.

Error logs

No response

Minified repro

No response

Versions

I think this field is not needed as I am running C++ code.

cc @ezyang @msaroufim @bdhirsh @anijain2305 @zou3519

alamj • Feb 25 '24 19:02