Xcode clang: a hands-on teardown diary

Open jmpews opened this issue 6 years ago • 0 comments

Prologue

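I have recently been working on some projects built around MachineFunctionPass. Normally there is little reason to dig into the target-specific details of clang's backend. This post describes some differences between Xcode clang and LLVM open-source clang.

Prerequisites

1. Building an LLVM pass

Xcode clang cannot load plugins directly the way LLVM clang can, so the pass has to be injected into Xcode clang through a trick. After some investigation, I ended up with 3 methods.

1.1. Building against libLLVM.dylib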

target_link_libraries(XxTrampoline LLVM hookzz findsymbol)

Applicable scenarios:

  1. Xcode clang
  2. Inject the pass through environment variables: DYLD_FORCE_FLAT_NAMESPACE=1 DYLD_INSERT_LIBRARIES=/Users/jmpews/Desktop/Alibaba/AliLLVM/XxTrampoline/xcode_build/Debug/Demo.dylib
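
The post does not show how Demo.dylib hooks itself into the compiler once dyld has loaded it. As a rough sketch of the load-time part only: GCC and Clang run any function marked `__attribute__((constructor))` when the image is loaded, before the host's main(), so an injected library can use that as its entry point. The names `on_dylib_loaded` and `injection_ran` are hypothetical, and the actual pass registration is deliberately left out because the post does not describe it.

```cpp
// Sketch of a load-time hook for an injected library.
// Assumption: on_dylib_loaded / injection_ran are illustrative names only;
// the real Demo.dylib entry point is not shown in the post.
#include <cassert>
#include <cstdio>

namespace {
bool g_hook_ran = false;  // set once the load-time hook has executed
}

// Runs when the image is loaded (for an injected dylib: before the host's
// main()), because of the constructor attribute.
__attribute__((constructor)) static void on_dylib_loaded() {
  assert(!g_hook_ran && "the load-time hook should run only once");
  g_hook_ran = true;
  std::fprintf(stderr, "[demo] injected library loaded\n");
  // This is one place where a plugin could start its own setup.
}

// Lets other code check whether the hook has already run.
bool injection_ran() { return g_hook_ran; }
```

Built as a shared library and named in DYLD_INSERT_LIBRARIES, the message would be printed as soon as the host process starts, which is a quick way to confirm that the injection itself works before debugging the pass.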

1.2. Building with -undefined dynamic_lookup

set (CMAKE_SHARED_LINKER_FLAGS "-undefined dynamic_lookup")
target_link_libraries(XxTrampoline hookzz findsymbol)

Applicable scenarios:

  1. LLVM open-source clang
  2. Inject the pass through environment variables: DYLD_FORCE_FLAT_NAMESPACE=1 DYLD_INSERT_LIBRARIES=/Users/jmpews/Desktop/Alibaba/AliLLVM/XxTrampoline/xcode_build/Debug/Demo.dylib

1.3. Building against libLLVMCore.a

set (CMAKE_SHARED_LINKER_FLAGS "-undefined dynamic_lookup")

set(LLVM_LINK_COMPONENTS
  ${LLVM_TARGETS_TO_BUILD}
)
llvm_map_components_to_libnames(_llvm_libs
  Core
)
llvm_expand_dependencies(llvm_libs ${_llvm_libs})

target_link_libraries(XxTrampoline hookzz findsymbol ${_llvm_libs})
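
Applicable scenarios:

  1. Xcode clang
  2. Static binary patching, i.e. injecting by inserting a load command (load_command) into the binary.

Xcode clang vs. LLVM open-source clang

LLVM open-source clang is open source, so a pass is naturally built against LLVM. But can a pass built that way be used directly with Xcode clang? In my tests, IR-level passes showed essentially no differences, but at the MachineFunctionPass level, i.e. inside a specific backend, there are differences.

Debugging Xcode clang

1.1. Disable SIP

In Recovery mode: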

csrutil disable

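1.2. Get the command line to debug

Passing -### makes the driver print the underlying -cc1 and ld invocations instead of running them; the -cc1 line is the one to launch under the debugger.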

[0] % clang -shared -isysroot `xcrun --sdk iphoneos --show-sdk-path` -arch arm64   /Users/jmpews/project/llvm-7.0/llvm_pass_demo/tests/test_simple.cc -o test_out -###
Apple LLVM version 10.0.0 (clang-1000.11.45.2)
Target: aarch64-apple-darwin18.0.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin
 "/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/clang" "-cc1" "-triple" "arm64-apple-ios12.0.0" "-Wdeprecated-objc-isa-usage" "-Werror=deprecated-objc-isa-usage" "-Werror=implicit-function-declaration" "-emit-obj" "-mrelax-all" "-disable-free" "-disable-llvm-verifier" "-discard-value-names" "-main-file-name" "test_simple.cc" "-mrelocation-model" "pic" "-pic-level" "2" "-mthread-model" "posix" "-mdisable-fp-elim" "-fno-strict-return" "-masm-verbose" "-munwind-tables" "-target-cpu" "cyclone" "-target-feature" "+fp-armv8" "-target-feature" "+neon" "-target-feature" "+crypto" "-target-feature" "+zcm" "-target-feature" "+zcz" "-target-abi" "darwinpcs" "-fallow-half-arguments-and-returns" "-dwarf-column-info" "-debugger-tuning=lldb" "-target-linker-version" "409.12" "-resource-dir" "/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib/clang/10.0.0" "-isysroot" "/Applications/Xcode.app/Contents/Developer/Platforms/iPhoneOS.platform/Developer/SDKs/iPhoneOS12.0.sdk" "-stdlib=libc++" "-fdeprecated-macro" "-fdebug-compilation-dir" "/Users/jmpews/project/llvm-7.0/llvm_pass_demo/tests" "-ferror-limit" "19" "-fmessage-length" "204" "-stack-protector" "1" "-fblocks" "-fencode-extended-block-signature" "-fobjc-runtime=ios-12.0.0" "-fcxx-exceptions" "-fexceptions" "-fmax-type-align=16" "-fdiagnostics-show-option" "-fcolor-diagnostics" "-o" "/var/folders/cs/9tbx0zs57lb60rbw9dh5zqk00000gn/T/test_simple-0f6121.o" "-x" "c++" "/Users/jmpews/project/llvm-7.0/llvm_pass_demo/tests/test_simple.cc"
 "/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ld" "-demangle" "-lto_library" "/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib/libLTO.dylib" "-no_deduplicate" "-dynamic" "-dylib" "-arch" "arm64" "-iphoneos_version_min" "12.0.0" "-syslibroot" "/Applications/Xcode.app/Contents/Developer/Platforms/iPhoneOS.platform/Developer/SDKs/iPhoneOS12.0.sdk" "-o" "test_out" "/var/folders/cs/9tbx0zs57lb60rbw9dh5zqk00000gn/T/test_simple-0f6121.o" "-lSystem" "/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib/clang/10.0.0/lib/darwin/libclang_rt.ios.a"

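1. Instruction opcodes are implemented differently

During testing, a check against the LDRXui opcode gave the wrong result. The opcode table is AArch64/AArch64GenInstrInfo.inc, which is generated by llvm-tblgen.

The AArch64 target alone can be built quickly with ninja lib/Target/AArch64/all.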

  void transformInstruction(MachineInstr &MI) {

#ifdef DEBUG
    dbgs() << "AArch64 Opcode: " << MI.getOpcode() << ", MI: " << MI << "\n";
#endif

  ...

Here we simply dump every opcode in Xcode clang together with its opcode name.

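const TargetInstrInfo *TII = MF.getSubtarget().getInstrInfo();

#ifdef DUMP_XCODE_AARCH64_OPCODE
    // Print "<opcode> : <name>" for every opcode the target knows about.
    // getNumOpcodes() bounds the loop instead of a hard-coded 3000, so
    // getName() is never asked for an index past the end of the table.
    for (unsigned i = 0, e = TII->getNumOpcodes(); i < e; ++i) {
      dbgs() << i << " : " << TII->getName(i) << "\n";
    }
#endif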

[Figures: the two opcode dumps being compared]

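Comparing the dumps in the figures above shows that Xcode's AArch64 target has extra opcodes for PAC instructions. Those extra entries presumably shift the numeric values of the opcodes around them, which would explain why an LDRXui constant compiled from the open-source headers no longer matches. PAC (pointer authentication) is an ARMv8.3 feature; for details see https://www.qualcomm.com/media/documents/files/whitepaper-pointer-authentication-on-armv8-3.pdf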
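
Comparing two dumps by eye gets tedious with a few thousand opcodes. The following is a minimal sketch of doing the comparison mechanically, assuming both dumps were saved as text in the "<index> : <name>" format printed by the dump loop. The helper names (`parseOpcodeDump`, `firstDivergence`, `onlyIn`) are made up for this illustration and are not part of LLVM.

```cpp
// Sketch: diff two opcode dumps in the "<index> : <name>" format.
// Assumption: all function names here are hypothetical helpers.
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstddef>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Parses a dump and returns the opcode names, indexed by opcode number.
std::vector<std::string> parseOpcodeDump(const std::string &dump) {
  std::vector<std::string> names;
  std::istringstream in(dump);
  std::string line;
  while (std::getline(in, line)) {
    const std::size_t sep = line.find(" : ");
    // Skip anything that is not an "<index> : <name>" entry.
    if (sep == std::string::npos || sep == 0 ||
        !std::isdigit(static_cast<unsigned char>(line[0])))
      continue;
    assert(sep + 3 <= line.size());  // " : " was found, so this holds
    const std::size_t index = std::stoul(line.substr(0, sep));
    if (names.size() <= index)
      names.resize(index + 1);
    names[index] = line.substr(sep + 3);
  }
  return names;
}

// First opcode number at which the two tables disagree; equals the shorter
// table's size when one table is a prefix of the other.
std::size_t firstDivergence(const std::vector<std::string> &a,
                            const std::vector<std::string> &b) {
  const std::size_t n = std::min(a.size(), b.size());
  for (std::size_t i = 0; i < n; ++i)
    if (a[i] != b[i])
      return i;
  return n;
}

// Names that appear in `a` but nowhere in `b`.
std::vector<std::string> onlyIn(const std::vector<std::string> &a,
                                const std::vector<std::string> &b) {
  const std::set<std::string> inB(b.begin(), b.end());
  std::vector<std::string> out;
  for (const std::string &name : a)
    if (!name.empty() && inB.count(name) == 0)
      out.push_back(name);
  return out;
}
```

Calling `onlyIn(xcodeNames, ossNames)` lists the opcodes that exist only in the Xcode table, and `firstDivergence` gives the first opcode number from which constants compiled against the open-source headers can no longer be trusted.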

2. Is Xcode clang optimized?!

The IR function below is used throughout the rest of this section; the comparisons that follow are all based on it.

; Function Attrs: noinline nounwind optnone ssp uwtable
define i32 @_Z3addii(i32, i32) #0 {
  %3 = load i8*, i8** getelementptr inbounds ([1 x i8*], [1 x i8*]* @TrampolineTable, i32 0, i32 0)
  indirectbr i8* %3, []
}

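A breakpoint trick: to inspect what each pass produces in runOnMachineFunction for one particular function, set a conditional breakpoint as follows (on an x86-64 host, $rsi holds the MachineFunction argument).

1. Set a breakpoint at `runOnMachineFunction`

2. Add the breakpoint condition `!(int)strcmp(((llvm::MachineFunction *)$rsi)->getName().str().c_str(),  "_Z3addii")`

Below is the MachineFunction that Xcode clang produces in AArch64DAGToDAGISel && SelectionDAGISel, along with the corresponding assembly code.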

# Machine code for function _Z3addii: IsSSA, TracksLiveness
Function Live Ins: %w0 in %0, %w1 in %2

%bb.0: derived from LLVM BB %2
    Live Ins: %w0 %w1
	%2:gpr32 = COPY %w1; GPR32:%2
	%0:gpr32 = COPY %w0; GPR32:%0
	%1:gpr32 = COPY killed %0; GPR32:%1,%0
	%3:gpr32 = COPY killed %2; GPR32:%3,%2
	%5:gpr64common = ADRP target-flags(aarch64-page) @TrampolineTable; GPR64common:%5
	%6:gpr64sp = ADDXri %5, target-flags(aarch64-pageoff, aarch64-nc) @TrampolineTable, 0; GPR64sp:%6 GPR64common:%5
	%7:gpr64 = LDRXui %6, 0; mem:LD8[getelementptr inbounds ([1 x i8*], [1 x i8*]* @TrampolineTable, i32 0, i32 0)] GPR64:%7 GPR64sp:%6
	BR %7; GPR64:%7

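[image-20190227155035651: assembly code generated by Xcode clang]

w0 and w1 are the arguments of add, and SelectionDAGISel::SelectAllBasicBlocks -> FastISel::lowerArguments treats them as live-in registers. The trampoline never uses w0 or w1, yet they are still saved.

Below is the MachineFunction produced by LLVM open-source clang, along with the corresponding assembly code.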

# Machine code for function _Z3addii: IsSSA, TracksLiveness, Legalized, RegBankSelected, Selected

%bb.1: derived from LLVM BB %entry
    Live Ins: %w0 %w1
	%4:gpr64 = MOVaddr target-flags(aarch64-page) @TrampolineTable, target-flags(aarch64-pageoff, aarch64-nc) @TrampolineTable; GPR64:%4
	%3:gpr64sp = COPY %4; GPR64sp:%3 GPR64:%4
	%2:gpr64 = LDRXui %3, 0; mem:LD8[getelementptr inbounds ([1 x i8*], [1 x i8*]* @TrampolineTable, i32 0, i32 0)] GPR64:%2 GPR64sp:%3
	BR %2; GPR64:%2

# End machine code for function _Z3addii.

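[image-20190227161019139: assembly code generated by LLVM open-source clang]

Here w0 and w1 are also marked as live-in, but the copies are removed by a later optimization.

The comparison shows that LLVM open-source clang also marks w0 and w1 as live-in registers in SelectionDAGISel::SelectAllBasicBlocks -> FastISel::lowerArguments. After the InstructionSelect MachineFunctionPass has run, however, the dead MIs have been erased.

The main job of the code below is to walk the MIs and use the register-use information in MRI to decide whether each MI is dead; if it is, the MI is erased from the MF.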

    bool ReachedBegin = false;
    for (auto MII = std::prev(MBB->end()), Begin = MBB->begin();
         !ReachedBegin;) {
#ifndef NDEBUG
      // Keep track of the insertion range for debug printing.
      const auto AfterIt = std::next(MII);
#endif
      // Select this instruction.
      MachineInstr &MI = *MII;

      // And have our iterator point to the next instruction, if there is one.
      if (MII == Begin)
        ReachedBegin = true;
      else
        --MII;

      DEBUG(dbgs() << "Selecting: \n  " << MI);

      // We could have folded this instruction away already, making it dead.
      // If so, erase it.
      if (isTriviallyDead(MI, MRI)) {
        DEBUG(dbgs() << "Is dead; erasing.\n");
        MI.eraseFromParentAndMarkDBGValuesForRemoval();
        continue;
      }
      ...
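
To make the effect of that bottom-up walk concrete, here is a toy model that does not use LLVM at all. It is only a sketch under simplifying assumptions: `Instr` and `eraseDeadInstrs` are hypothetical names, each instruction defines at most one virtual register, and "trivially dead" is reduced to "has no side effects and its result has no remaining users". Walking from the last instruction to the first matters because erasing a user can make the instruction that fed it dead within the same sweep.

```cpp
// Toy model of a bottom-up dead-instruction sweep (not LLVM code).
// Assumption: Instr and eraseDeadInstrs are illustrative names only.
#include <algorithm>
#include <cassert>
#include <map>
#include <string>
#include <vector>

struct Instr {
  std::string opcode;
  int def;                // virtual register defined, or -1 for none
  std::vector<int> uses;  // virtual registers read
  bool hasSideEffects;    // e.g. branches and stores must be kept
};

std::vector<Instr> eraseDeadInstrs(const std::vector<Instr> &block) {
  // Count how many times each virtual register is read.
  std::map<int, int> useCount;
  for (const Instr &I : block)
    for (int reg : I.uses)
      ++useCount[reg];

  std::vector<Instr> kept;
  // Walk from the last instruction to the first.
  for (auto it = block.rbegin(); it != block.rend(); ++it) {
    const bool dead =
        !it->hasSideEffects && it->def >= 0 && useCount[it->def] == 0;
    if (dead) {
      // Erasing this instruction releases its operands, which may make
      // the instructions defining them dead as well.
      for (int reg : it->uses) {
        assert(useCount[reg] > 0);
        --useCount[reg];
      }
      continue;
    }
    kept.push_back(*it);
  }
  std::reverse(kept.begin(), kept.end());  // restore program order
  return kept;
}
```

Feeding it the eight instructions of the Xcode clang dump for _Z3addii (four COPYs, ADRP, ADDXri, LDRXui, BR) leaves only ADRP, ADDXri, LDRXui and BR in this model: the two COPYs of %0 and %2 go first, and the COPYs from w0 and w1 follow once nothing reads them.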

Adapting InstructionSelect into the injected pass and applying it to Xcode clang, the dead MIs are now clearly erased.

# Machine code for function _Z3addii: IsSSA, TracksLiveness
Function Live Ins: %w0 in %0, %w1 in %2

%bb.0: derived from LLVM BB %2
    Live Ins: %w0 %w1
	%5:gpr64common = ADRP target-flags(aarch64-page) @TrampolineTable; GPR64common:%5
	%6:gpr64sp = ADDXri %5, target-flags(aarch64-pageoff, aarch64-nc) @TrampolineTable, 0; GPR64sp:%6 GPR64common:%5
	%7:gpr64 = LDRXui %6, 0; mem:LD8[getelementptr inbounds ([1 x i8*], [1 x i8*]* @TrampolineTable, i32 0, i32 0)] GPR64:%7 GPR64sp:%6
	BR %7; GPR64:%7

# End machine code for function _Z3addii.

3. Number of FunctionPasses in Xcode clang

Compare the number of FunctionPasses registered in Xcode clang with the number registered in LLVM open-source clang. (This does not objectively prove anything, though.)

clang`llvm::FPPassManager::runOnFunction:
->  0x100357bf0 <+0>:    pushq  %rbp

(lldb) p/d ((llvm::FPPassManager *)$rdi)->getNumContainedPasses()
(unsigned int) $2 = 29

FunctionPasses registered in LLVM open-source clang:

(lldb) p/d getNumContainedPasses()
(unsigned int) $7 = 39

4. A starting point for discussion

jmpews, Feb 27 '19 10:02