
Is there C++ API provided?

Open • wanghuihhh opened this issue on Oct 09 '22 • 5 comments

Thanks for your project! When we want to deploy our model in a C++ project, is there a C++ API we can use? We couldn't find any C++ API. If you could provide one, we would appreciate it.

wanghuihhh avatar Oct 09 '22 03:10 wanghuihhh

Yes, the Python runtime is just a thin wrapper around the C++ runtime, so you can use the generated model with the C API directly. This note is helpful: https://github.com/facebookincubator/AITemplate/tree/main/static

The Model class also shows how to use the C API from Python, which can serve as a reference for C++ usage:

https://github.com/facebookincubator/AITemplate/blob/main/python/aitemplate/compiler/model.py
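
For illustration, here is a minimal C++ sketch (not from this thread or the repo) of driving the generated test.so directly, mirroring what the Python Model class does through ctypes. The symbol names come from the generated model interface, but the path, the "0 means success" convention, and the exact signatures below are assumptions; check the model_interface.h emitted next to your compiled model before relying on them.

```cpp
// Minimal sketch: load the AITemplate-generated model library and create a
// model container via the exported C API. Signatures are assumed; verify them
// against the model_interface.h generated alongside test.so.
#include <dlfcn.h>
#include <cstddef>
#include <cstdio>
#include <cstdlib>

using AITemplateModelHandle = void*;
// Assumed: returns 0 on success, takes a handle out-param and a runtime count.
using CreateFn = int (*)(AITemplateModelHandle*, size_t /*num_runtimes*/);
using DeleteFn = int (*)(AITemplateModelHandle);

int main() {
  // Example path; point this at the .so produced by your compile step.
  void* lib = dlopen("./tmp/resnet50_64/test.so", RTLD_NOW | RTLD_LOCAL);
  if (!lib) {
    std::fprintf(stderr, "dlopen failed: %s\n", dlerror());
    return EXIT_FAILURE;
  }

  auto create = reinterpret_cast<CreateFn>(
      dlsym(lib, "AITemplateModelContainerCreate"));
  auto destroy = reinterpret_cast<DeleteFn>(
      dlsym(lib, "AITemplateModelContainerDelete"));
  if (!create || !destroy) {
    std::fprintf(stderr, "dlsym failed: %s\n", dlerror());
    dlclose(lib);
    return EXIT_FAILURE;
  }

  AITemplateModelHandle handle = nullptr;
  if (create(&handle, /*num_runtimes=*/1) != 0) {
    std::fprintf(stderr, "AITemplateModelContainerCreate failed\n");
    dlclose(lib);
    return EXIT_FAILURE;
  }

  // Next step (omitted here): bind input/output buffers and call the
  // AITemplateModelContainerRun entry point, following how
  // python/aitemplate/compiler/model.py invokes it through ctypes.

  destroy(handle);
  dlclose(lib);
  return EXIT_SUCCESS;
}
```

Going through dlopen/dlsym keeps the sketch header-free; if you have the generated model_interface.h available, including it and linking against the .so directly is the more convenient route.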

antinucleon avatar Oct 09 '22 03:10 antinucleon

Thank you! I tried it, and I found that the Python runtime and the C++ runtime differ quite a lot in how they are used. If you could provide a C++ runtime example, we would appreciate it.

Besides, when I wanted to test a multi-instance model, I changed the code in "AITemplate/examples/01_resnet-50/benchmark_ait.py", but I get an error. The changes I made:

- Line 76: mod = Model(os.path.join("./tmp", model_name, "test.so"), 2)  # create two runtimes
- Lines 89 & 97: added num_threads=2, use_unique_stream_per_thread=True

I get the following error:

[./tmp/resnet50_64/conv2d_bias_relu_373.cu] Got cutlass error: Error Internal at: 154
[16:56:40] ./tmp/resnet50_64/model_interface.cu:158: Error: GraphInstantiate(&graph_exec, graph) API call failed: operation not permitted when stream is capturing at ./tmp/resnet50_64/model-generated.h, line 3009
Traceback (most recent call last):
  File "benchmark_ait_test.py", line 136, in <module>
    main()
  File "/home/admin/.local/lib/python3.8/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/home/admin/.local/lib/python3.8/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/home/admin/.local/lib/python3.8/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/admin/.local/lib/python3.8/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "benchmark_ait_test.py", line 132, in main
    benchmark("resnet50", batch_size, graph_mode=use_graph)
  File "benchmark_ait_test.py", line 89, in benchmark
    t, _, __ = mod.benchmark_with_tensors(
  File "/home/admin/.local/lib/python3.8/site-packages/aitemplate/compiler/model.py", line 610, in benchmark_with_tensors
    mean, std, ait_outputs = self.benchmark(
  File "/home/admin/.local/lib/python3.8/site-packages/aitemplate/compiler/model.py", line 566, in benchmark
    self.DLL.AITemplateModelContainerBenchmark(
  File "/home/admin/.local/lib/python3.8/site-packages/aitemplate/compiler/model.py", line 192, in _wrapped_func
    raise RuntimeError(f"Error in function: {method.__name__}")
RuntimeError: Error in function: AITemplateModelContainerBenchmark

How should we test a multi-instance model?

wanghuihhh avatar Oct 10 '22 09:10 wanghuihhh

cc @mikeiovine

antinucleon avatar Oct 11 '22 08:10 antinucleon

This looks like a known issue with graph mode that should be fixed in the next release. For now I recommend setting graph_mode=False to test this case.

mikeiovine avatar Oct 12 '22 18:10 mikeiovine

Hey @wanghuihhh, so I thought this was a separate issue, but it looks like something else entirely. The problem is that we were starting our stream captures with cudaStreamCaptureModeGlobal instead of cudaStreamCaptureModeThreadLocal, causing issues in the multi-threaded tests. I'll have a patch out to fix this particular issue soon.
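
To illustrate the distinction (this is a standalone sketch, not AITemplate's runtime code): a capture started with cudaStreamCaptureModeGlobal prohibits capture-unsafe CUDA calls on other threads as well, producing errors like the "operation not permitted when stream is capturing" seen above, whereas cudaStreamCaptureModeThreadLocal restricts that only to the thread doing the capturing.

```cpp
// Standalone illustration (not AITemplate source) of stream-capture modes.
// A Global-mode capture on any thread forbids capture-unsafe CUDA calls
// (e.g. cudaMalloc, cudaGraphInstantiate) process-wide; a ThreadLocal-mode
// capture only restricts the capturing thread, which is what a
// multi-threaded, multi-stream benchmark needs.
// Build with nvcc, or with a host compiler linked against libcudart.
#include <cuda_runtime.h>
#include <cstdio>

int main() {
  cudaStream_t stream;
  cudaStreamCreate(&stream);

  // Allocate before capturing: allocation inside someone else's Global-mode
  // capture is exactly the kind of call that would be rejected.
  void* buf = nullptr;
  cudaMalloc(&buf, 1 << 20);

  // The fix described above: capture with thread-local mode so other threads
  // can keep issuing CUDA calls (and building their own graphs) concurrently.
  cudaGraph_t graph;
  cudaStreamBeginCapture(stream, cudaStreamCaptureModeThreadLocal);
  cudaMemsetAsync(buf, 0, 1 << 20, stream);  // recorded as a graph node
  cudaStreamEndCapture(stream, &graph);

  cudaGraphExec_t graph_exec;
  // CUDA 11.x-style instantiate signature.
  cudaGraphInstantiate(&graph_exec, graph, nullptr, nullptr, 0);
  cudaGraphLaunch(graph_exec, stream);
  cudaStreamSynchronize(stream);

  cudaGraphExecDestroy(graph_exec);
  cudaGraphDestroy(graph);
  cudaFree(buf);
  cudaStreamDestroy(stream);
  std::puts("graph captured and replayed");
  return 0;
}
```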

As for the C++ runtime docs: I'll add a proper example. Thanks for the suggestion!

cc @antinucleon

mikeiovine avatar Oct 13 '22 19:10 mikeiovine