Taichi CMake Overhaul
We would like to share our proposal for modernizing Taichi's CMake-based build system. By embracing the target-based approach, we can enforce a good modular design in our code base. This brings benefits such as reduced (re-)compilation time. By making the dependencies in the current code base explicit and untangling them, targets that do not depend on others can be isolated and built independently (target-level parallelism), and changes to one target do not trigger rebuilds of unrelated targets.
This is ongoing work. Some effort has already been spent on cleaning up the code base:
- #2196
- #2203
- #2195
Related issues
- #4882
Proposal
The proposed changes consist of two main parts. First, we would like to maintain a list of the targets that Taichi uses and is built upon. Each of these targets should have its own `CMakeLists.txt` that specifies both the build requirements and the usage requirements of that target. This may require some shuffling in our code base. Second, we would like to replace many of our current CMake functions that have global scope with target-based APIs. For example, `include_directories` should be replaced with `target_include_directories` to keep hidden dependencies on header files to a minimum. Here, we share the outcome of some preliminary discussions:
Targets
The current Taichi C++ code base (https://github.com/taichi-dev/taichi/tree/master/taichi) could be split into the following build targets (names TBD):
- `program`: Core module that calls other targets in order. (https://github.com/taichi-dev/taichi/tree/master/taichi/program)
- `ir`: Includes ir, analysis, and transform. This may depend on `type`, `snode`, etc.
- `codegen`: Currently located at https://github.com/taichi-dev/taichi/tree/master/taichi/backends. It can be further divided into `LlvmCodegen` and `SpirvCodegen`. `codegen` should depend on `runtime`.
- `runtime`: At the moment shares the same location as `codegen`. Code needs to be shuffled to split it out.
- `artifact`: Information required by both `codegen` and `runtime`. One example in the context of AOT is KernelAttributes: https://github.com/taichi-dev/taichi/blob/5b890f90f9edb90c2f1b4841ea7142f4b13e0bb2/taichi/backends/metal/kernel_utils.h#L105-L156 As a first step, we can move this information from `codegen` to `runtime` so that `runtime` does not depend on `codegen`.
- `common`: Includes logging, macros, and similar utilities. This target is shared among all targets.
- `type`: Many targets may depend on this.
- `snode`: Ideally we want one snode target that others can depend on. Currently, the code is spread across ir, struct, etc. We can have an snode builder so that an implemented snode no longer contains its constructor information.
- `gui`: Taichi_core's peripheral targets.
- `python`: Pybind-related code.
- `system`: Includes `platform`.
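The dependencies that the target list spells out explicitly can also be written down with `target_link_libraries`. The following is a minimal sketch, not Taichi's actual build files: the target names are the tentative ones from the list (marked TBD there), and `INTERFACE` placeholder libraries are an assumption made only so the snippet configures without any source files. Relationships the list does not state (for example, what `program`, `gui`, `python`, or `system` link against) are deliberately left out.

```cmake
cmake_minimum_required(VERSION 3.13)
project(taichi_targets_sketch LANGUAGES NONE)

# Placeholder targets. INTERFACE libraries carry no sources, so this
# sketch configures on its own; real targets would be STATIC/OBJECT/SHARED.
foreach(tgt common type snode artifact runtime codegen ir)
  add_library(${tgt} INTERFACE)
endforeach()

# "common" is shared among all targets.
foreach(tgt type snode artifact runtime codegen ir)
  target_link_libraries(${tgt} INTERFACE common)
endforeach()

# "ir" may depend on "type" and "snode".
target_link_libraries(ir INTERFACE type snode)

# "artifact" is required by both "codegen" and "runtime";
# "codegen" should depend on "runtime", never the other way round.
target_link_libraries(runtime INTERFACE artifact)
target_link_libraries(codegen INTERFACE artifact runtime)
```

Because each edge is declared on a target, an accidental reverse dependency (such as `runtime` linking `codegen`) would show up as a cycle or as an explicit line in review rather than as a hidden include.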
This dependency graph provides an overview of the mentioned targets:
API Changes
- Minimize the usage of global variables and be explicit by using targets. For example, `include_directories` -> `target_include_directories`, `link_directories` -> `target_link_directories`, etc. Targets in sub-directories can be added by `add_subdirectory()`.
- Differentiate between the build and usage requirements of targets: use `PRIVATE` for build requirements and `INTERFACE` or `PUBLIC` for usage requirements. For example, `-Wall` is a build requirement, not a usage requirement.
- Replace these file glob APIs with explicit `target_sources` calls: https://github.com/taichi-dev/taichi/blob/5b890f90f9edb90c2f1b4841ea7142f4b13e0bb2/cmake/TaichiCore.cmake#L89-L103 File globbing is not recommended in modern CMake. More importantly, avoid using `TI_WITH_XX` to guard the inclusion of source files. This leads to many `TI_WITH_XX` guards polluting the Taichi core target, such as https://github.com/taichi-dev/taichi/blob/91d6f60f25abdf92fd1ec1d0de719bdcf9b82b6a/taichi/program/program.cpp#L78-L112
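As an illustration of the three points above, a per-target `CMakeLists.txt` might look roughly like this. It is a sketch under assumptions: the directory layout, the source file names, and the `vulkan` sub-directory are hypothetical and do not reflect Taichi's actual tree; only the CMake commands themselves and the `TI_WITH_VULKAN` option name are taken as given. Guarding `add_subdirectory()` with the option is one possible reading of "avoid using `TI_WITH_XX` to guard the inclusion of source files", not the only one.

```cmake
# --- taichi/runtime/CMakeLists.txt (hypothetical location) ---

# Creating a library without sources and filling it in later
# requires CMake 3.11 or newer.
add_library(runtime STATIC)

# Explicit source list instead of file(GLOB ...).
target_sources(runtime
  PRIVATE
    runtime.cpp       # hypothetical file name
    memory_pool.cpp   # hypothetical file name
)

target_include_directories(runtime
  PUBLIC  ${CMAKE_CURRENT_SOURCE_DIR}/include  # usage requirement: consumers need these headers
  PRIVATE ${CMAKE_CURRENT_SOURCE_DIR}          # build requirement: internal headers only
)

# Warning flags are a build requirement, so they stay PRIVATE and do
# not leak into targets that link against runtime.
# (-Wall is GCC/Clang syntax; MSVC would need a different flag.)
target_compile_options(runtime PRIVATE -Wall)

# --- top-level CMakeLists.txt ---

option(TI_WITH_VULKAN "Build the Vulkan backend" OFF)

add_subdirectory(taichi/runtime)
if(TI_WITH_VULKAN)
  # The option decides whether the backend target exists at all,
  # instead of switching individual source files inside a core target.
  add_subdirectory(taichi/runtime/vulkan)  # hypothetical path
endif()
```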
Implementation Roadmap
In phase one, we would like to divide the current core target, namely `taichi_isolated_core`, into a few major build targets including `program`, `codegen`, `runtime`, `ir`, and `python`.
https://github.com/taichi-dev/taichi/blob/91d6f60f25abdf92fd1ec1d0de719bdcf9b82b6a/cmake/TaichiCore.cmake#L244-L245
- [x] Split `runtime` from the backends dir. (Define `runtime` targets)
  - [x] cc
  - [x] cpu
  - [x] cuda
  - [x] dx
  - [x] interop
  - [x] metal
  - [x] opengl
  - [x] vulkan
  - [x] wasm
- [x] Split `codegen` from the backends dir. (Define `codegen` targets)
- [ ] Define `ir` target. Resolve the dependencies on `program`, `backends`, etc.
- [x] Isolate Pybind related code.
In phase two, we can further split the shared and peripheral targets from e.g. `program`. This also includes `artifact` and `snode`. In addition, we can move header files to a separate `taichi/include` directory. This allows us to distribute libraries with headers.
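To show how a separate `taichi/include` directory could tie into distributing libraries with headers, here is a hedged sketch using CMake's install and export machinery. The target name `taichi_runtime_sketch`, the export set name, the source path, and the `taichi::` namespace are all hypothetical choices for illustration; the thread does not prescribe any of them.

```cmake
include(GNUInstallDirs)

# Hypothetical target and source path.
add_library(taichi_runtime_sketch STATIC taichi/runtime/runtime.cpp)

# Inside the build tree, headers come from taichi/include; after
# installation, consumers see them under the install prefix instead.
target_include_directories(taichi_runtime_sketch
  PUBLIC
    $<BUILD_INTERFACE:${PROJECT_SOURCE_DIR}/taichi/include>
    $<INSTALL_INTERFACE:${CMAKE_INSTALL_INCLUDEDIR}>
)

# Install the library and record it in an export set.
install(TARGETS taichi_runtime_sketch
  EXPORT taichi_sketch_targets
  ARCHIVE DESTINATION ${CMAKE_INSTALL_LIBDIR}
)

# Ship the public headers alongside the library.
install(DIRECTORY ${PROJECT_SOURCE_DIR}/taichi/include/
  DESTINATION ${CMAKE_INSTALL_INCLUDEDIR}
)

# Generate an importable targets file for downstream projects.
install(EXPORT taichi_sketch_targets
  NAMESPACE taichi::
  DESTINATION ${CMAKE_INSTALL_LIBDIR}/cmake/taichi
)
```

Keeping every distributable header under one directory means the `install(DIRECTORY ...)` call needs no per-file filtering.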
Tasks here will be continuously updated.
Cool! Could you also provide a dependency graph? https://excalidraw.com/ could be your friend (@ailzhang recommended)
Nice graph! IIUC, `type` and `snode` are sub-components under `ir`? Also, `artifact` is probably too generic; better to come up with a better name for it... (Context: this refers to the artifacts/outcome from codegen)
> IIUC, `type` and `snode` are sub-components under `ir`?
At the moment yes. Nevertheless, `type` should not be an ir-specific component. Other targets such as `codegen` can also depend on `type` (and it does). The same goes for `snode`. Ideally we want to have a `type` as well as an `snode` target.
> Also `artifact` is probably too generic, better come up with a better naming for it... (Context: this refers to the artifacts/outcome from codegen)
agree, any suggestions?
Currently in the `taichi/backends` folder, apart from the code for `runtime` and `codegen`, we also have code for our unified device API. Maybe we can make a `backend` target for this part? The dependencies would then be something like:
- `runtime` -> (depends on) `backend`
- `codegen` -> `backend`
- `codegen` -> `runtime`
- `program` -> `backend`
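Expressed in CMake, the four edges listed above would amount to something like the sketch below. The `INTERFACE` placeholder libraries are an assumption so the snippet configures without sources, and `backend` is the name suggested in this comment rather than a settled one.

```cmake
cmake_minimum_required(VERSION 3.13)
project(backend_split_sketch LANGUAGES NONE)

# Source-less placeholders standing in for the real targets.
foreach(tgt backend runtime codegen program)
  add_library(${tgt} INTERFACE)
endforeach()

target_link_libraries(runtime INTERFACE backend)          # runtime -> backend
target_link_libraries(codegen INTERFACE backend runtime)  # codegen -> backend, codegen -> runtime
target_link_libraries(program INTERFACE backend)          # program -> backend
```

Note that `backend` itself links nothing here, which is what makes it safe for the other three targets to share.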
WDYT? @k-ye @ailzhang
Would be great to separate out the unified device API! I'd call it `rhi`, though. @bobcao3
Discussion: We should later distinguish between public and private headers (currently we don't). I think the recommended way is: public headers are always included by their absolute path (meaning the path relative to the Taichi project's root), while private headers can be included by a path relative to their target's include folder, as long as each header file is guaranteed a unique path.
Agree. AOT glue code commonly refers to unexported functions by mistake, which leads to linking problems.
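One way the public/private header convention discussed here could map onto include directories is sketched below. The target name, source file, header names, and the `include` sub-folder are hypothetical; the only assumption taken from the discussion is that public headers are spelled relative to the project root and private ones relative to the target's own include folder.

```cmake
# Hypothetical target and source file.
add_library(codegen_sketch STATIC codegen.cpp)

target_include_directories(codegen_sketch
  # Public headers: the project root is the only exported search path, so
  # consumers write e.g.  #include "taichi/codegen/codegen.h"  (hypothetical header)
  PUBLIC  ${PROJECT_SOURCE_DIR}
  # Private headers: visible only while building this target, written as
  # e.g.  #include "detail/emitter.h"  (hypothetical header)
  PRIVATE ${CMAKE_CURRENT_SOURCE_DIR}/include
)
```

Since the private directory is never propagated to dependents, code outside the target cannot include those headers without spelling out the full path, which makes accidental use of internal declarations easier to spot.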
Note that right now everything is also grouped in a two-level namespace: `::taichi::lang`. We should also consider just using `taichi`, with only the components that are truly language-related going into `taichi::lang`, e.g. CHI IR. For backend stuff, it could be something like `taichi::codegen`, `taichi::runtime`, etc.
Update:
- API changes implemented as proposed, including file globs, directory APIs, scope specifiers, etc.
- Newly defined targets include `rhi`, `codegen`, `program_impls`, and `runtime`. The current dependency relationship is:
Next steps:
- [x] Clean up header dependencies among `rhi`, `codegen`, and `runtime`.
- [x] Define common utilities targets to break from `core` source files.
- [x] Isolate Pybind source files
- [ ] Split IR from `core` files
Current dependency graph
Ideally `TaichiCore.cmake` should only deal with the `taichi_core`-related build, which means it is decoupled from language frontends such as Python and others. @AmesingFlank What is the status of `TI_EMSCRIPTENED`? WDYT if we rename this to `taichi_javascript`, and maybe move this part into a separate `TaichiJavascript.cmake`?
https://github.com/taichi-dev/taichi/blob/4bc6f0cb4f6d8078ebc6badb29d98da6617d4436/cmake/TaichiCore.cmake#L530
@qiao-bo For taichi.js, I have decided to move away from using emscripten to compile taichi, because of binary size and performance issues. Instead, I have re-implemented the functionalities that I require in Javascript. So I think for now it would be a good idea to remove all `TI_EMSCRIPTENED` related code from the C++ codebase.
OK, that makes it cleaner, thanks for the info.
Update: Current dependency graph among targets extracted from CMake.
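The thread does not say how the graph was extracted. One common way, offered here only as an assumption, is CMake's built-in Graphviz output, which can be tuned through a `CMakeGraphVizOptions.cmake` file placed in the source or build directory. The graph name below is a hypothetical choice.

```cmake
# CMakeGraphVizOptions.cmake
# Read by CMake when it is invoked with --graphviz=<file>.

set(GRAPHVIZ_GRAPH_NAME "taichi_targets")  # hypothetical graph title

# Keep only the project's own library targets in the picture.
set(GRAPHVIZ_EXTERNAL_LIBS FALSE)
set(GRAPHVIZ_EXECUTABLES FALSE)

# Emit one combined graph instead of extra per-target and depender files.
set(GRAPHVIZ_GENERATE_PER_TARGET FALSE)
set(GRAPHVIZ_GENERATE_DEPENDERS FALSE)
```

With this file in place, running `cmake --graphviz=taichi_targets.dot .` from the build directory writes a `.dot` file that Graphviz can render, for example with `dot -Tpng`.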
Update together with @ailzhang: In addition to `taichi_core` (which can be further split into `ir`, `analysis`, etc.), we can split an `artifact` target from the current `runtime` targets. This information is required by both `codegen` and `runtime`. One example in the context of AOT is KernelAttributes:
https://github.com/taichi-dev/taichi/blob/5b890f90f9edb90c2f1b4841ea7142f4b13e0bb2/taichi/backends/metal/kernel_utils.h#L105-L156
- [x] move this information from codegen to runtime such that runtime does not depend on codegen.
- [ ] split `artifact` target!
FYI @PGZXB #5889 introduced an extra dependency from `gfx_runtime` to `spirv_codegen`, which is unexpected. You can reproduce this by running `TAICHI_CMAKE_ARGS="-DTI_WITH_VULKAN:BOOL=ON -DTI_WITH_OPENGL:BOOL=OFF " python setup.py develop` and then `import taichi`. Specifically, it was introduced by https://github.com/taichi-dev/taichi/pull/5889/files#diff-f99a0ddc44f29052309b4ae1983406e7c442e8f44b683937edd532a1e70f2269R3. It can be worked around by linking `spirv_codegen` to the `gfx_runtime` target, but that contradicts what we want. Ideally, if cache_manager has to depend on both gfx_runtime and spirv_codegen, it can be a separate target. WDYT?