yuhao
I used FontForge to delete the reflection of "fl" and "fi", and it works well on Win10.
Nginx seems to fail to handle symbolic links even when `disable_symlinks` is off on the CI machine; maybe it's caused by the mounted file system.
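For reference, a minimal sketch of the setting in question (the server/location layout is hypothetical; only the `disable_symlinks` directive matters here, and `off` is nginx's default):

```nginx
server {
    listen 80;
    location /static/ {
        # "off" (the default) tells nginx to follow symlinks without
        # checking them; the CI failure happens even so, which points
        # at the mounted file system rather than the nginx config.
        disable_symlinks off;
        root /var/www;
    }
}
```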
Definition of the ExtractKernelLaunchTensor pass: after the wrap-kernel-launch pass, the IR passes data as a pure tensor flow. This pass introduces `!okl.launcher_ctx` into the data flow, making it the actual manager of the data flow at the OKL abstraction level. It introduces the `okl.get_tensor_from_ctx` op to produce tensors for different purposes, and each purpose is tagged with the `tensor_type` EnumAttr so that more abstraction-level information is preserved.
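A minimal sketch of what the tensor flow looks like once the context is introduced, with spellings matching the text above and the IR dump further down (the dump uses `okl.get_tensor_from_arg`); the concrete tensor shape and the meaning of `tensor_type = 0` are assumptions:

```mlir
module {
  func.func @wrap0(%ctx: !okl.launcher_ctx) {
    // The launcher context now owns the data flow: tensors are no
    // longer passed as plain SSA arguments but fetched from the ctx.
    // tensor_type tags the tensor's purpose (assumed: 0 = kernel argument).
    %in = "okl.get_tensor_from_ctx"(%ctx) {tensor_type = 0 : i32}
          : (!okl.launcher_ctx) -> tensor<2xf32>
    %out = "oneflow.relu"(%in) : (tensor<2xf32>) -> tensor<2xf32>
    return
  }
}
```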
Definition of WrapOpsToKernelLaunchPass: pack the consecutive ops used for computation inside a oneflow::job into a single func, embed that func's assembly into one oneflow.kernel_launch op, and properly map the consecutive ops' tensor flow onto the func's args and return. For example:
```
job(%0) {
  %1 = oneflow.relu(%0)
  %2 = oneflow.relu(%1)
  return %2
}
```
is converted into
```
job(%0) {
  %2 = oneflow.kernel_launch(%0) @{"%1 = oneflow.relu(%0)  %2 = oneflow.relu(%1)  return %2"} (tensor -> tensor)
  return %2
}
```
The second pass of RoundTrip finally generates:
```
module {
  oneflow.job @GraphToRun_0(%arg0: tensor) -> tensor {
    %output = "oneflow.input"(%arg0) {data_type = 2 : i32, device_name = ["@0:0"], device_tag = "cpu", hierarchy = [1], is_dynamic =...
```
```
module {
  func.func @_mlir__mlir_ciface_okl_func(%arg0: !okl.launcher_ctx) attributes {compiled = "true"} {
    %0 = "okl.build_reg_ctx"() ({
    ^bb0(%arg1: tensor):
      %6 = "oneflow.relu"(%arg1) {device_name = ["@0:0"], device_tag = "cpu", hierarchy = [1], op_name...
```
// -----// IR Dump After ExtractKernelLaunchTensorPass //----- //
```
module {
  func.func @wrap0(%arg0: !okl.launcher_ctx) -> (tensor, tensor) {
    %0 = "okl.get_tensor_from_arg"(%arg0) {tensor_type = 0 : i32} : (!okl.launcher_ctx) -> tensor...
```
Set ONEFLOW_MLIR_FUSE_KERNEL_LAUNCH=1 to enable the oneflow kernel launch feature, which packs computation ops. At the end of the roundtrip, the consecutive ops used for computation are merged into a single kernel launch op:

oneflow ops -> oneflow.kernel_launch{mlir_assembly="wrap"}

For example:
```mlir
module {
  oneflow.job @GraphToRun_1(%arg0: tensor) -> tensor {
    %output = "oneflow.input"(%arg0) {data_type = 2 :...
```
Thanks a lot for any suggestions.
So, what is the solution?