Developer installation notes

Open ruro opened this issue 3 years ago • 13 comments

I had some problems while trying to get a developer environment up and running. I was able to figure it out myself in the end, but here is a short walkthrough of the issues I ran into. Hopefully, my notes will help anyone who runs into the same issues.


  1. Currently on Arch Linux the default python version is 3.10, which is not supported by taichi.
    • Install python39 from AUR

    • I highly recommend using a virtual environment so that the "plain" pip and python commands point to the 3.9 versions instead of the system ones. Like this:

      python3.9 -m venv --upgrade-deps .venv
      . .venv/bin/activate
      pip install -r requirements_dev.txt
      pip install -r requirements_test.txt
      

  2. Use the pre-compiled Clang and LLVM binaries provided on this page.

    Note that there are 2 download links for pre-built binaries on that page, and you need BOTH OF THEM! In retrospect, this is really obvious, but I missed the explanation about the customized taichi-specific LLVM binaries and only downloaded the Clang+LLVM archive, assuming that the LLVM version that is bundled with Clang will just work. It will not!

    If you get errors about undefined reference to `typeinfo for llvm::CallbackVH', then you probably made the same mistake as me. Or you didn't configure your PATH properly. Or you are compiling LLVM from sources instead of using the precompiled binaries, and you forgot to enable RTTI (-DLLVM_ENABLE_RTTI:BOOL=ON).

    • Download Clang + LLVM 10.0.0 pre-built binary for Ubuntu 18.04.

    • Download LLVM 10.0.0 for Linux.

    • Extract both of them somewhere (I extracted them straight into the root of the taichi repo for convenience).

    • Create a shell script that prepares your environment (I named it env.sh)

      export CLANG_PATH="$(realpath path/to/extracted/clang+llvm-10.0.0-x86_64-linux-gnu-ubuntu-18.04/)"
      export LLVM_PATH="$(realpath path/to/extracted/taichi-llvm-10.0.0-linux/)"
      
      . .venv/bin/activate
      export TAICHI_CMAKE_ARGS="-DCMAKE_CXX_COMPILER=${CLANG_PATH}/bin/clang++ $TAICHI_CMAKE_ARGS"
      export PATH="${LLVM_PATH}/bin:${CLANG_PATH}/bin:$PATH"
      
    • Activate the combined environment

      source env.sh
      


  3. If you get compiler errors about clang failing to "compile a simple test program" due to missing libtinfo.so.5, then you need to install ncurses5-compat-libs from AUR (or your system equivalent of ncurses version 5).

  4. If you get compiler errors about variable length array declaration not allowed at file scope and MINSIGSTKSZ, then you need to update the catch.hpp external library.

    • This issue is already fixed in newer versions of catch.hpp, so hopefully taichi will eventually upgrade their version.
    • For now you can download it from here (direct link to catch.hpp) manually and place it into external/include/catch.hpp.

  5. If you get a lot of tests failing due to CUDA_ERROR_OUT_OF_MEMORY, reduce the number of threads used for tests:
    • TI_TEST_THREADS=1 python tests/run_tests.py ...
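
The checks implied by steps 1 and 2 can be sketched as a few shell functions. This is only an illustration under some assumptions: the function names are made up and are not part of taichi, the Python check encodes nothing more than the 3.10 limitation mentioned above, and the RTTI check assumes that the extracted LLVM archive ships an llvm-config binary in its bin directory.

```shell
#!/bin/sh
# Hypothetical sanity checks for the environment prepared by env.sh.
# None of these function names come from taichi itself.

# Succeeds if the given "major.minor[.patch]" version is 3.x with x < 10.
# Only the upper bound from the notes is encoded; no lower bound is checked.
python_version_ok() {
    major="${1%%.*}"
    rest="${1#*.}"
    minor="${rest%%.*}"
    [ "$major" -eq 3 ] && [ "$minor" -lt 10 ]
}

# Succeeds if the llvm-config binary at path $1 reports an RTTI-enabled build.
llvm_has_rtti() {
    [ "$("$1" --has-rtti 2>/dev/null)" = "YES" ]
}

# Succeeds if the command named $1 resolves to a file under directory $2.
resolves_under() {
    case "$(command -v "$1")" in
        "$2"/*) return 0 ;;
        *) return 1 ;;
    esac
}

# Runs all checks against the variables exported by env.sh and prints a
# hint for each one that fails.
check_env() {
    python_version_ok "$(python -c 'import sys; print("%d.%d" % sys.version_info[:2])')" \
        || echo "python is 3.10 or newer: activate the 3.9 venv"
    llvm_has_rtti "${LLVM_PATH}/bin/llvm-config" \
        || echo "llvm-config under LLVM_PATH is missing or was built without RTTI"
    resolves_under clang++ "${CLANG_PATH}/bin" \
        || echo "clang++ on PATH does not come from CLANG_PATH"
}
```

After `source env.sh`, calling `check_env` prints nothing if all three checks pass.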

ruro avatar Mar 12 '22 09:03 ruro

It's really impressive to see you sort out everything on your own! We don't use Arch Linux often here, so the docs for it are indeed imperfect. Reopening this issue temporarily to make sure we update the dev install docs for Arch Linux before the v1.0 release. @RuRo, please feel free to contribute if you're interested!

ailzhang avatar Mar 16 '22 09:03 ailzhang

FYI, Python 3.10 will be supported as soon as the next release. It should already be possible to build the current source with py310; please submit an issue if you encounter any problems. Thanks!

qiao-bo avatar Mar 16 '22 09:03 qiao-bo

Thank you, @RuRo, for the detailed note.

  1. For Python 3.10 support, we are working on it, see https://github.com/taichi-dev/taichi/issues/3985. I'm wondering what stops you from making 3.10 work. Could you briefly document it so that people can help? 3.10 support is definitely on our roadmap.
  2. Yes, you will need a customized LLVM. Do you mind adding this clarification to the documentation? It will help a lot of developers.
  3. You are right. You will actually find this on the user installation page. It's true that we should include something similar on the developer installation page. Could you open up a PR for this too?
  4. Thanks a ton for fixing it already!
  5. Yes, this is a common frequent... Adding this to the doc would be helpful too.

It's really impressive that you have overcome all these challenges :-)

yuanming-hu avatar Mar 16 '22 09:03 yuanming-hu

Hmm, LLVM keeps biting at us. I wonder if TVM has some better experience managing the LLVM dependencies..? (cc @masahi )

k-ye avatar Mar 17 '22 01:03 k-ye

I wonder if TVM has some better experience managing the LLVM dependencies

TVM can use distro-provided LLVM, so I don't recall our users complaining about their installation experience with TVM + LLVM. I remember Taichi uses clang to compile the runtime lib, but TVM doesn't depend on clang. So our llvm story is a bit simpler than taichi's.

That said, TVM has many #if TVM_LLVM_VERSION >= 110 etc to cope with many versions of LLVM with their annoying breaking API changes. We also have an active contributor from Qualcomm who makes sure that our LLVM-dependent code is compatible with the upstream LLVM. That helps a lot too.

I'd say it is a matter of whether developers or users bear the burden of LLVM. If more development effort makes the life of users easier, I believe it is totally worth it. As taichi gets more and more users, maybe it is worth revisiting the dep on the custom llvm and clang?

masahi avatar Mar 17 '22 04:03 masahi

That said, TVM has many #if TVM_LLVM_VERSION >= 110 etc to cope with many versions of LLVM with their annoying breaking API changes.

Ha, I was wondering how TVM was able to cope with multiple LLVM versions...

If more development effort makes the life of users easier, I believe it is totally worth it. As taichi gets more and more users, maybe it is worth revisiting the dep on the custom llvm and clang?

Can't agree more. Actually, the original goal of releasing Taichi's prebuilt LLVM lib is to avoid the efforts of compiling LLVM with all the necessary flags.

IIUC, the problem with Taichi's clang + LLVM can be summarized as below:

  1. We need clang to compile part of our runtime into LLVM modules.
  2. We only support a specific version of LLVM as of now, which restricts the clang version in 1.

I can see two approaches here:

  1. Ship not only the pre-compiled LLVM lib, but also the complete compilation toolchain (mostly clang)
  2. Follow TVM and support multiple LLVMs. Then, for all the major clang versions in use, we can ask the users to install the matching LLVM lib.

I feel like the second option is the more standard approach here.

k-ye avatar Mar 17 '22 07:03 k-ye

That said, TVM has many #if TVM_LLVM_VERSION >= 110 etc to cope with many versions of LLVM with their annoying breaking API changes. We also have an active contributor from Qualcomm who makes sure that our LLVM-dependent code is compatible with the upstream LLVM. That helps a lot too.

That reminds me: Halide has a lot of LLVM guards too...

My personal feeling: supporting multiple LLVM versions can actually harm code maintainability.

We only support a specific version of LLVM as of now, which restricts the clang version in 1.

Perhaps the version of clang does not have to match that of LLVM, since we use clang to compile runtime.cpp and Taichi itself, which should give consistent results over different clang versions.

I would actually vote for the current way Taichi is handling LLVM: just support one single version.

yuanming-hu avatar Mar 17 '22 07:03 yuanming-hu

Perhaps the version of clang does not have to match that of LLVM, since we use clang to compile runtime.cpp and Taichi itself, which should give consistent results over different clang versions.

Previously we had this problem: users had clang-13 as their system default. The runtime.cpp compiled by clang-13 had compatibility issues with the LLVM-10 library.

+1 on harming the code maintainability.
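
A mismatch like the clang-13 / LLVM-10 case above can be flagged before building by comparing major versions. The sketch below is only illustrative: the helper names are hypothetical, and whether a matching major version is strictly required is exactly what is being debated in this thread, so it merely reports a difference.

```shell
# Hypothetical helpers; not part of taichi.

# Prints the major component of a version string such as "10.0.0".
major_of() {
    printf '%s\n' "${1%%.*}"
}

# Prints the version number found on the first line of `clang --version`
# output, e.g. "clang version 13.0.1" -> "13.0.1".
clang_version_from_banner() {
    printf '%s\n' "$1" | sed -n '1s/.*clang version \([0-9][0-9.]*\).*/\1/p'
}

# Succeeds if both version strings share the same major version.
same_major() {
    [ "$(major_of "$1")" = "$(major_of "$2")" ]
}
```

With clang and llvm-config on PATH, `same_major "$(clang_version_from_banner "$(clang --version)")" "$(llvm-config --version)" || echo "clang and LLVM major versions differ"` prints a warning when the two diverge.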

k-ye avatar Mar 17 '22 08:03 k-ye

Hi, sorry for the delayed reply. Unfortunately, I am kind of busy lately, so I won't be able to contribute anything in the foreseeable future. I'll keep submitting issues as I run into them while using taichi and I might revisit them once I get some free time. No promises tho, so if anyone wants to contribute it, that would be great.

ruro avatar Mar 17 '22 20:03 ruro

Submitting high quality issues like this one is already a great help to us, thank you! :-)

k-ye avatar Mar 19 '22 04:03 k-ye

I'd support only one LLVM version, but the current one needs an update. We should move to LLVM 12 or 13 so that we don't end up getting stuck on 10 forever. (The later the switch, the more costly the changes will be...) Moving to a newer LLVM can also help our performance and porting efforts.

bobcao3 avatar Mar 20 '22 19:03 bobcao3

Yes, we should upgrade to LLVM 12. Our strategy is upgrading every year. Since LLVM now releases two major versions per year, we will adopt LLVM 12, 14, 16, 18 in the future.

Previous thread for upgrading to LLVM 10: https://github.com/taichi-dev/taichi/issues/655

yuanming-hu avatar Mar 21 '22 01:03 yuanming-hu

Yes, we should upgrade to LLVM 12. Our strategy is upgrading every year. Since LLVM now releases two major versions per year, we will adopt LLVM 12, 14, 16, 18 in the future.

Previous thread for upgrading to LLVM 10: #655

when? taichi dev is stopped...

johnnynunez avatar Mar 05 '25 13:03 johnnynunez