yuhao
I used FontForge to delete the reflection of "fl" and "fi", and it works well on Win10.
Nginx seems to fail to handle symbolic links even when `disable_symlinks` is off on the CI machine; maybe it's caused by the mounted file system.
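For reference, a minimal sketch of the setting in question (the server/location layout is hypothetical; only the `disable_symlinks` directive matters here, and `off` is nginx's default):

```nginx
server {
    listen 80;
    location /static/ {
        # "off" (the default) tells nginx to follow symlinks without
        # checking them; the CI failure happens even so, which points
        # at the mounted file system rather than the nginx config.
        disable_symlinks off;
        root /var/www;
    }
}
```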
Definition of the ExtractKernelLaunchTensor pass: after the wrap-kernel-launch pass, the IR passes data as a pure tensor flow. This pass introduces `!okl.launcher_ctx` into the data flow, making it the actual manager of the data flow at the OKL abstraction level. It introduces the `okl.get_tensor_from_ctx` op to produce tensors for different purposes, and each purpose is tagged with the `tensor_type` EnumAttr so that more abstraction-level information is preserved.
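A minimal sketch of what the tensor flow looks like once the context is introduced, with spellings matching the text above and the IR dump further down (the dump uses `okl.get_tensor_from_arg`); the concrete tensor shape and the meaning of `tensor_type = 0` are assumptions:

```mlir
module {
  func.func @wrap0(%ctx: !okl.launcher_ctx) {
    // The launcher context now owns the data flow: tensors are no
    // longer passed as plain SSA arguments but fetched from the ctx.
    // tensor_type tags the tensor's purpose (assumed: 0 = kernel argument).
    %in = "okl.get_tensor_from_ctx"(%ctx) {tensor_type = 0 : i32}
          : (!okl.launcher_ctx) -> tensor<2xf32>
    %out = "oneflow.relu"(%in) : (tensor<2xf32>) -> tensor<2xf32>
    return
  }
}
```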
Definition of WrapOpsToKernelLaunchPass: pack the consecutive ops used for computation inside a oneflow::job into a single func, embed that func's assembly into one oneflow.kernel_launch op, and properly map the consecutive ops' tensor flow onto the func's args and return. For example:
```
job(%0) {
  %1 = oneflow.relu(%0)
  %2 = oneflow.relu(%1)
  return %2
}
```
is converted into
```
job(%0) {
  %2 = oneflow.kernel_launch(%0) @{"%1 = oneflow.relu(%0)  %2 = oneflow.relu(%1)  return %2"} (tensor -> tensor)
  return %2
}
```
The second pass of RoundTrip finally generates:
```
module {
  oneflow.job @GraphToRun_0(%arg0: tensor) -> tensor {
    %output = "oneflow.input"(%arg0) {data_type = 2 : i32, device_name = ["@0:0"], device_tag = "cpu", hierarchy = [1], is_dynamic =...
```
```
module {
  func.func @_mlir__mlir_ciface_okl_func(%arg0: !okl.launcher_ctx) attributes {compiled = "true"} {
    %0 = "okl.build_reg_ctx"() ({
    ^bb0(%arg1: tensor):
      %6 = "oneflow.relu"(%arg1) {device_name = ["@0:0"], device_tag = "cpu", hierarchy = [1], op_name...
```
// -----// IR Dump After ExtractKernelLaunchTensorPass //----- //
```
module {
  func.func @wrap0(%arg0: !okl.launcher_ctx) -> (tensor, tensor) {
    %0 = "okl.get_tensor_from_arg"(%arg0) {tensor_type = 0 : i32} : (!okl.launcher_ctx) -> tensor...
```
Set ONEFLOW_MLIR_FUSE_KERNEL_LAUNCH=1 to enable the oneflow kernel launch feature, which packs computation ops. At the end of the roundtrip, the consecutive ops used for computation are merged into a single kernel launch op:

oneflow ops -> oneflow.kernel_launch{mlir_assembly="wrap"}

For example:
```mlir
module {
  oneflow.job @GraphToRun_1(%arg0: tensor) -> tensor {
    %output = "oneflow.input"(%arg0) {data_type = 2 :...
```
Thanks a lot for any suggestions.
So, what is the solution?