[BUG]: Max Graph can't concat two custom ops
Bug description
I built a very simple graph with two custom ops. The input is a single float32 tensor with shape 1x8 and the output is a float32 tensor with shape 2x8. The two custom ops are called gelu_1 and gelu_2; the graph just computes the two GELUs, concatenates them, and returns the result. I get this error:
```
mojo concat_graph.mojo
%2 = "rmo.concat"(%0, %0) {axis = 0 : i64, outputParamDecls = #kgen<param.decls[]>} : (!mo.tensor<[1, 8], f32>, !mo.tensor<[1, 8], f32>) -> !mo.tensor<[2, 8], f32>
%3 = "rmo.concat"(%1, %1) {axis = 0 : i64, outputParamDecls = #kgen<param.decls[]>} : (!mo.tensor<[1, 8], f32>, !mo.tensor<[1, 8], f32>) -> !mo.tensor<[2, 8], f32>
%4 = "rmo.concat"(%0, %1) {axis = 0 : i64, outputParamDecls = #kgen<param.decls[]>} : (!mo.tensor<[1, 8], f32>, !mo.tensor<[1, 8], f32>) -> !mo.tensor<[2, 8], f32>
Please submit a bug report to https://github.com/modularml/mojo/issues and include the crash backtrace along with all the relevant source codes.
Stack dump:
0.  Program arguments: mojo concat_graph.mojo
Stack dump without symbol names (ensure you have llvm-symbolizer in your PATH or set the environment var LLVM_SYMBOLIZER_PATH to point to it):
0  mojo                               0x000000010049cc04 llvm_strlcpy + 51480
1  mojo                               0x000000010049aef0 llvm_strlcpy + 44036
2  mojo                               0x000000010049d2a4 llvm_strlcpy + 53176
3  libsystem_platform.dylib           0x000000018dbe5a24 _sigtramp + 56
4  libmof.dylib                       0x0000000139cc458c __jit_debug_register_code + 4466816
5  libmof.dylib                       0x0000000139cc4190 __jit_debug_register_code + 4465796
6  libmof.dylib                       0x0000000138485270 mlirTypeIDAllocatorAllocateTypeID + 3531068
7  libmof.dylib                       0x0000000138489188 mlirTypeIDAllocatorAllocateTypeID + 3547220
8  libmof.dylib                       0x0000000138488138 mlirTypeIDAllocatorAllocateTypeID + 3543044
9  libmof.dylib                       0x0000000138489c74 mlirTypeIDAllocatorAllocateTypeID + 3550016
10 libmof.dylib                       0x0000000138495174 mlirTypeIDAllocatorAllocateTypeID + 3596352
11 libmof.dylib                       0x00000001384a767c mlirTypeIDAllocatorAllocateTypeID + 3671368
12 libmof.dylib                       0x0000000138125dc0 mlirSymbolTableWalkSymbolTables + 4588
13 libmof.dylib                       0x00000001384a61f0 mlirTypeIDAllocatorAllocateTypeID + 3666108
14 libmof.dylib                       0x00000001384a4fa0 mlirTypeIDAllocatorAllocateTypeID + 3661420
15 libmof.dylib                       0x0000000138446b90 mlirTypeIDAllocatorAllocateTypeID + 3275356
16 libmof.dylib                       0x000000013a85bac8 __jit_debug_register_code + 16620988
17 libmof.dylib                       0x000000013a85c500 __jit_debug_register_code + 16623604
18 libmof.dylib                       0x000000013a85f1dc __jit_debug_register_code + 16635088
19 libmof.dylib                       0x0000000138023318 M_openPath + 114380
20 libmof.dylib                       0x0000000138005640 M_compileToBinaryFromSource + 556
21 libmodular-framework-common.dylib  0x0000000108fe0a14 M_deleteDouble_Histogram + 65800
22 libmodular-framework-common.dylib  0x0000000108fb94b8
23 libmodular-framework-common.dylib  0x0000000108fc0fb4
24 libmodular-framework-common.dylib  0x0000000108fc8854 M_compileModelSync + 192
25 libmodular-framework-common.dylib  0x000000028006ff54 M_compileModelSync + 6292142016
26 mojo                               0x0000000100831510 __jit_debug_register_code + 1041480
27 mojo                               0x00000001003fd54c
28 mojo                               0x00000001003fcf40
29 mojo                               0x00000001003e5940
30 dyld                               0x000000018d8350e0 start + 2360
mojo crashed! Please file a bug report.
[20417:4465035:20240903,203202.357894:WARNING crash_report_exception_handler.cc:257] UniversalExceptionRaise: (os/kern) failure (5)
zsh: segmentation fault  mojo concat_graph.mojo
```
The weird thing is this. I have the following code:
```mojo
def construct_graph() -> Graph:
    graph = Graph(
        in_types=List[Type](TensorType(DType.float32, 1, 8)),
        out_types=List[Type](TensorType(DType.float32, 2, 8)),
    )
    gelu_1 = ops.custom["my_gelu_1"](List[Symbol](graph[0]), graph[0].type())
    gelu_2 = ops.custom["my_gelu_2"](List[Symbol](graph[0]), graph[0].type())
    concat_calc_1 = ops.concat(List[Symbol](gelu_1, gelu_1))
    concat_calc_2 = ops.concat(List[Symbol](gelu_2, gelu_2))
    concat_calc_3 = ops.concat(List[Symbol](gelu_1, gelu_2))
    print(concat_calc_1)
    print(concat_calc_2)
    print(concat_calc_3)
    # graph.output(concat_calc_1)
    # graph.output(concat_calc_2)
    graph.output(concat_calc_3)
    graph.verify()
    return graph
```
If I call `graph.output(concat_calc_3)`, I get the error. If I call `graph.output(concat_calc_1)` or `graph.output(concat_calc_2)` instead, there is no problem.
So I think the problem only occurs when the output is built from two different custom ops.
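Here is a stripped-down sketch of the failing case, reduced from the full program under "Steps to reproduce" below. I haven't run exactly this file; `construct_minimal_graph` is just a name for the sketch, and it assumes the same `custom_ops.mojopkg` as the full program:

```mojo
from max.graph import Graph, TensorType, Type, ops, Symbol
from max.engine import InferenceSession
from pathlib import Path


def construct_minimal_graph() -> Graph:
    # Same 1x8 float32 input and 2x8 output as the full example, but only the
    # failing concat of two *different* custom ops is kept.
    graph = Graph(
        in_types=List[Type](TensorType(DType.float32, 1, 8)),
        out_types=List[Type](TensorType(DType.float32, 2, 8)),
    )
    gelu_1 = ops.custom["my_gelu_1"](List[Symbol](graph[0]), graph[0].type())
    gelu_2 = ops.custom["my_gelu_2"](List[Symbol](graph[0]), graph[0].type())
    graph.output(ops.concat(List[Symbol](gelu_1, gelu_2)))
    graph.verify()
    return graph


def main():
    session = InferenceSession()
    # The segfault happens here, while the graph is being compiled,
    # before anything is executed.
    _ = session.load(
        construct_minimal_graph(),
        custom_ops_paths=Path("custom_ops.mojopkg"),
    )
```

Judging by the `M_compileModelSync` frames in the backtrace, the crash happens inside `session.load()`, i.e. during graph compilation, before any execution.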
Steps to reproduce
The main code (concat_graph.mojo) is:
```mojo
from max.graph import Graph, TensorType, Type, ops, Symbol
from max import engine
from max.engine import InferenceSession
from tensor import Tensor, TensorShape
from pathlib import Path


def construct_graph() -> Graph:
    graph = Graph(
        in_types=List[Type](TensorType(DType.float32, 1, 8)),
        out_types=List[Type](TensorType(DType.float32, 2, 8)),
    )
    gelu_1 = ops.custom["my_gelu_1"](List[Symbol](graph[0]), graph[0].type())
    gelu_2 = ops.custom["my_gelu_2"](List[Symbol](graph[0]), graph[0].type())
    concat_calc_1 = ops.concat(List[Symbol](gelu_1, gelu_1))
    concat_calc_2 = ops.concat(List[Symbol](gelu_2, gelu_2))
    concat_calc_3 = ops.concat(List[Symbol](gelu_1, gelu_2))
    print(concat_calc_1)
    print(concat_calc_2)
    print(concat_calc_3)
    # graph.output(concat_calc_1)
    # graph.output(concat_calc_2)
    graph.output(concat_calc_3)
    graph.verify()
    return graph


def main():
    session = InferenceSession()
    model = session.load(
        construct_graph(),
        custom_ops_paths=Path("custom_ops.mojopkg"),
    )
    input_ = Tensor[DType.float32].randn((1, 8))
    results = model.execute("input0", input_^)
    output = results.get[DType.bool]("output0")
    print(output)
```
The custom ops file is:

```mojo
from max.extensibility import Tensor, empty_tensor
from max import register
from math import erf, sqrt


@register.op("my_gelu_1")
fn gelu_1[type: DType, rank: Int](x: Tensor[type, rank]) -> Tensor[type, rank]:
    var output = empty_tensor[type](x.shape())

    @always_inline
    @parameter
    fn func[width: Int](i: StaticIntTuple[rank]) -> SIMD[type, width]:
        var tmp = x.simd_load[width](i)
        return tmp / 2 * (1 + erf(tmp / sqrt(2)))

    print("Hello, custom GELU!")
    output.for_each[func]()
    return output^


@register.op("my_gelu_2")
fn gelu_2[type: DType, rank: Int](x: Tensor[type, rank]) -> Tensor[type, rank]:
    var output = empty_tensor[type](x.shape())

    @always_inline
    @parameter
    fn func[width: Int](i: StaticIntTuple[rank]) -> SIMD[type, width]:
        var tmp = x.simd_load[width](i)
        return tmp / 2 * (1 + erf(tmp / sqrt(2)))

    print("Hello, custom GELU!")
    output.for_each[func]()
    return output^
```
System information
- What OS did you install MAX on? Mac
- Provide version information for MAX by pasting the output of `max -v`: max 24.4.0 (59977802)
- Provide version information for Mojo by pasting the output of `mojo -v`: mojo 24.4.0 (59977802)
- Provide Modular CLI version by pasting the output of `modular -v`: modular 0.9.2 (b3079bd5)