Some suggestions for the CMake build

Open jinluyang opened this issue 3 years ago • 6 comments

1. In FindNCCL.cmake, I think the "set(NCCL_INCLUDE_DIR $ENV{NCCL_INCLUDE_DIR} CACHE ...)" call should be wrapped in an "if (DEFINED ENV{...})" check, so that the cache variable is not set to "" when the environment variable is not set. Otherwise, even if the environment variable is set later, the empty value already stored in the cache still takes effect.
2. When building the PyTorch version, setting BUILD_GPT=OFF doesn't work, maybe because gpt.h still has to be compiled.
3. At line 63 of fused_multihead_attention_op.cc, the rank of the from tensor should be 2, not 3.
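
For point 1, the guard could look roughly like the sketch below. This is only an illustration, not the project's actual FindNCCL.cmake: the cache type (PATH) and the docstring are placeholders I assumed, since the original call is elided after CACHE. The underlying behavior is that a set(... CACHE ...) without FORCE does not overwrite an entry that already exists in CMakeCache.txt, so an empty value written on the first configure run sticks.

```cmake
# Sketch only: create the cache entry just when the environment variable exists,
# so an unset NCCL_INCLUDE_DIR does not leave an empty value in CMakeCache.txt.
# The PATH type and the docstring are assumed placeholders, not the project's text.
if(DEFINED ENV{NCCL_INCLUDE_DIR})
  set(NCCL_INCLUDE_DIR $ENV{NCCL_INCLUDE_DIR} CACHE PATH "Directory containing NCCL headers (placeholder docstring)")
endif()
```

If an empty entry is already cached from an earlier run, it would still need to be cleared, for example by deleting CMakeCache.txt or passing -UNCCL_INCLUDE_DIR to cmake.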

jinluyang avatar Apr 27 '21 09:04 jinluyang

It also seems that some CMake files lack -lmpi_cxx, which causes some compile errors.

jinluyang avatar Apr 29 '21 08:04 jinluyang

We will fix issues 2 and 3 first. Issue 1 will take us some time to check. Did you encounter any problems due to the missing -lmpi_cxx?

byshiue avatar Apr 29 '21 09:04 byshiue

Yes, I encountered errors such as "undefined reference to ompi_mpi_cxx_op_intercept", just like the one described here: https://stackoverflow.com/questions/17548436/openmpi-error-during-compilation. With -DBUILD_GPT=ON -DBUILD_TF=ON and LD_LIBRARY_PATH set, sample/cpp/gpt_sample.cc throws that error.

jinluyang avatar Apr 29 '21 09:04 jinluyang

Are you running in a different environment? You can propose a pull request to fix the bug, since we can reproduce it in the Docker image we recommend.

byshiue avatar Apr 29 '21 09:04 byshiue

Yes, I'm running in a different environment; my MPI is Open MPI 1.8.5. Alright, I get it.

jinluyang avatar Apr 29 '21 09:04 jinluyang

Thanks for your advice! I ran into the missing -lmpi_cxx error and fixed it by adding -lmpi_cxx -lmpi to target_link_libraries in CMakeLists.txt and sample/cpp/CMakeLists.txt.
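
As a rough sketch, the change in sample/cpp/CMakeLists.txt might look like the fragment below. The target name gpt_sample is an assumption based on the sample/cpp/gpt_sample.cc file mentioned earlier in this thread, and PUBLIC is just one possible linkage keyword; the real target names and existing library lists in the project may differ.

```cmake
# Sketch only: append the Open MPI C++ bindings and the core MPI library
# to a target that fails with "undefined reference to ompi_mpi_cxx_op_intercept".
# "gpt_sample" is a hypothetical target name; substitute the target that fails to link.
target_link_libraries(gpt_sample PUBLIC -lmpi_cxx -lmpi)
```

The same two flags would be added to each target_link_libraries call whose target fails to link, and the directory containing the MPI libraries still has to be visible to the linker.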

zhanzy178 avatar Nov 02 '21 12:11 zhanzy178

Closing this bug because it is inactive. Feel free to re-open this issue if you still have any problems.

byshiue avatar Sep 06 '22 01:09 byshiue

I made the changes suggested by @zhanzy178 in multiple places and was able to build without any errors. Thanks @zhanzy178 for the input.

karthikiitm87 avatar Aug 08 '23 14:08 karthikiitm87