Run Lidar Centerpoint with TVM
Checklist
- [X] I've read the contribution guidelines.
- [X] I've searched other issues and no duplicate issues were found.
- [X] I've agreed with the maintainers that I can plan this task.
Description
There are packages in Universe that use TensorRT as their neural network inference library. We would like to document a way to convert a TensorRT-based model into a TVM model. We will start the investigation with the lidar_centerpoint package: https://github.com/autowarefoundation/autoware.universe/tree/main/perception/lidar_centerpoint
Purpose
Investigate a way to use TVM.
Possible approaches
TBD
Definition of done
- [ ] Run lidar_centerpoint using TVM library
- [ ] Write the document for the procedure to convert TensorRT based module to TVM based module
Here's a summary of what I learned about TVM (thanks @ambroise-arm for your help; please let me know if anything is incorrect):
- TVM aims to be backend-agnostic during development, but at deployment time the model is compiled for the desired backend
- The workflow is as follows:
  - Parse the ONNX files via ARM's `tvm_cli.py` tool (https://github.com/autowarefoundation/modelzoo/blob/master/scripts/tvm_cli/tvm_cli.py), which will tune the model using AutoTVM.
  - The script will generate several files:
    - A network module in the form of a shared library (`deploy_lib.so`)
    - A JSON file describing the network graph (`deploy_graph.json`)
    - A file containing the parameters of the network (`deploy_param.params`)
    - A header file with the configuration of the inference engine (`inference_engine_tvm_config.hpp`)
  - In the C++ code, load `deploy_lib.so` via TVM's `dlopen` mechanism; this enables the code to be platform agnostic while remaining performant.
- TVM allows developers to use backend-specific code (e.g. CUDA), though the goal is to abstract that away entirely
Notes
- The TVM team is working on their own CLI tool called TVMC (https://tvm.apache.org/docs/tutorial/tvmc_command_line_driver.html), which would supersede `tvm_cli.py`
- ARM ported Autoware.AI's lidar_point_pillars package from TensorRT to TVM; this can be used as a reference on how to migrate to TVM. It currently contains backend-specific code through CUDA. Additionally, both TensorRT and TVM are supported and one is chosen at build time (TVM is the default).
@Sharrrrk knows a candidate to work on this. @Sharrrrk, can you ask them to follow up on this issue?
I have some problems when compiling centerpoint with TVM, for both the "llvm" backend and the "cuda" backend. I'm not sure if the problem is caused by TVM or by improper configuration file settings. Taking the "pts_backbone_neck_head_default.onnx" model as an example, the settings and problems are described below.
- the "definition.yaml" of centerpoint
- the problems encountered:
  a. for the cpu backend:
  b. for the cuda backend:

Looking forward to your advice.
Your configuration file looks correct. I think the issue comes from TVM not having good support for dynamic parameters.
I'm not sure what the exact effect of specifying "-1" as a shape parameter is. I've tried to let TVM handle the shape automatically by making the following change in ModelZoo and then building the docker image locally:

```diff
diff --git a/scripts/tvm_cli/tvm_cli.py b/scripts/tvm_cli/tvm_cli.py
index 7d5fa54..fec132a 100755
--- a/scripts/tvm_cli/tvm_cli.py
+++ b/scripts/tvm_cli/tvm_cli.py
@@ -75 +75 @@ def get_network(info):
- mod, params = relay.frontend.from_onnx(onnx_model, info['input_dict'])
+ mod, params = relay.frontend.from_onnx(onnx_model)
```

Testing for llvm, the error becomes `RuntimeError: Invalid type of axis: <class 'tvm.tir.expr.Add'>`, which gets referenced here: https://discuss.tvm.apache.org/t/how-could-us-use-tvm-relay-transform-tomixedprecision/10465/14
And testing for cuda I get `RuntimeError: cuda winograd conv2d doesn't support dynamic input height or width.`
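One way to see where the dynamic dimensions come from is to inspect the ONNX graph inputs directly. A small sketch using the standard onnx Python API (the file name follows the discussion above):

```python
# Sketch: print each graph input and its dimensions; dynamic axes show up
# as symbolic names (dim_param) or as missing concrete sizes.
import onnx

model = onnx.load("pts_backbone_neck_head_default.onnx")
for inp in model.graph.input:
    dims = []
    for d in inp.type.tensor_type.shape.dim:
        dims.append(d.dim_value if d.HasField("dim_value") else d.dim_param or "?")
    print(inp.name, dims)
```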
Recently I've been trying to use the dev version of TVM (i.e. the main branch) instead of the stable 0.8 version, which seems to solve the `RuntimeError: Invalid type of axis: <class 'tvm.tir.expr.Add'>` error. However, the `RuntimeError: cuda winograd conv2d doesn't support dynamic input height or width.` error remains.
If we don't have enough engineers who can support TVM packages, then I don't think we can proceed with the task. If there is no progress for another week, we might want to consider using another ML inference library. (past discussion about using TVM)
Although it seems we can't compile this model with dynamic parameters in the current state of TVM, I can compile it just fine with fixed parameters. The code for LidarCenterpoint in this repository already hardcodes "batch_size" to 1, and the default values for `range_max_*_`, `range_min_*_` and `voxel_size_*_` give an "H" and "W" value of 560.
By replacing the "-1" with those values in "definition.yaml", the model compiles without errors.
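As an illustration, a hedged Python sketch of this fixed-shape compilation, using the single-return `relay.build` form mentioned later in the thread. The input node name and channel count are assumptions; take the real ones from the ONNX model:

```python
# Sketch: compile the backbone with fixed shapes instead of "-1".
# "spatial_features" and the channel count 32 are illustrative assumptions.
import onnx
import tvm
from tvm import relay

onnx_model = onnx.load("pts_backbone_neck_head_default.onnx")
input_dict = {"spatial_features": (1, 32, 560, 560)}  # batch_size=1, H=W=560
mod, params = relay.frontend.from_onnx(onnx_model, input_dict)
lib = relay.build(mod, target="llvm", params=params)  # or "vulkan", "cuda"
lib.export_library("deploy_lib.so")
```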
@liuzf1988 I see that you are trying to target CUDA as a backend, but support for CUDA is not enabled in `autoware/model-zoo-tvm-cli:latest`. We have `autoware/model-zoo-tvm-cli:latest-cuda` for that, but it's not really tested; and even if it compiles, we don't have a prebuilt TVM runtime that supports CUDA on the Autoware side. I suggest targeting Vulkan instead if you want GPU acceleration.
@mitsudome-r So, good news: I was wrong during the ASWG when I said it was a model issue. If we don't plan on changing the width and height parameters, or if we can keep a small number of possible values, then we can probably go ahead with TVM and fixed parameters.
Yes, I totally agree with compiling the LidarCenterpoint model using the default values for "batch_size", "H" and "W" for now. TVM's Relax architecture (i.e. Relay Next), which supports dynamic shape workloads, is still under development.
@liuzf1988 @ambroise-arm Thanks for the update. @liuzf1988 Could you try your proposed approach of using the default values for the model input?
@liuzf1988 Are there any updates on this task?
@xmfcx I've been busy these days; sorry for not being able to reply in time. If I understand correctly, we need to write a document for converting the raw Lidar Centerpoint model to TVM-based modules. Apart from the document, the code for compiling the raw model will also be provided. I'll try to get these done by this weekend.
@ambroise-arm Referring to the official TVM documentation and the existing script (tvm_cli.py) of the Autoware model zoo, I wrote an independent script for compiling the neural network model. However, there are some issues that I need to confirm with you:
- Which files need to be uploaded to the cloud?
  In the original script, i.e. `tvm_cli.py`, the command used to compile the model is `graph, lib, params = relay.build(mod, target=info['target'], params=params)`. However, in the current version, tvm-0.10.dev0, this compilation command is deprecated and becomes `lib = relay.build(mod, target=info['target'], params=params)`. So where we originally had to upload three output files, "deploy_lib.so", "deploy_param.params", and "deploy_graph.json", now only the "deploy_lib.so" file needs to be uploaded (see the sketch after this list). In addition, where should the compiled module file (deploy_lib.so) be uploaded; should it go to the Amazon S3 bucket as before? And for which backends should compiled module files be provided (llvm and vulkan, right)?
- How to import the centerpoint_tvm package into Autoware.Universe?
  I see that you proposed a pull request (PR-1181) to port the lidar_apollo_segmentation_tvm package from Autoware.Auto to Autoware.Universe. Do I need to write a similar package named lidar_centerpoint_tvm to port the centerpoint module in the same way? If so, it needs some extra time, and this work is in progress.
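A hedged sketch of why the single `.so` suffices with the factory-style `relay.build` output: the graph JSON and parameters are embedded in the exported library, so only the shared library needs to be distributed.

```python
# Sketch: loading a factory-style module. deploy_lib.so already embeds the
# graph and parameters, so no separate .json/.params files are needed.
import tvm
from tvm.contrib import graph_executor

lib = tvm.runtime.load_module("deploy_lib.so")
module = graph_executor.GraphModule(lib["default"](tvm.cpu(0)))
```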
@liuzf1988
Thanks for your update.

> I wrote an independent script for compiling the neural network model.

If the current script cannot be used as is, do you think it can be modified to accommodate the compilation of Lidar Centerpoint? That would be preferable to having standalone scripts per model.
Files
For the distribution of the neural network, the idea would be to use the ModelZoo repository. The onnx neural network can be committed to the repository (with the relevant definition.yaml file, and possible modifications to the script to have it compile). That way the network will start to get compiled and uploaded automatically to the S3 bucket as part of the ModelZoo CI (currently for the llvm and vulkan backends). (see diagrams of the architecture)
Currently ModelZoo uses TVM 0.8 to compile the models, and the tvm_vendor package that provides the runtime capability to Autoware is also on version 0.8 at the moment. If there is a need to update to TVM 0.9 for Lidar Centerpoint to work, that can be scheduled. But we won't be using a development version, so no 0.10 until it is released.
Package
Yes, I think it is important to have a dedicated package for `lidar_centerpoint_tvm`. The tvm_utility package provides a `pipeline.hpp` header so that all packages that make use of TVM in Autoware can use a common interface both for downloading the compiled models and for the inference. `tvm_utility` has a yolo_v2_tiny test case as an example, and `lidar_apollo_segmentation_tvm` acts as an example for a standalone package. I think it would be good for `lidar_centerpoint_tvm` to also use the interface provided by `tvm_utility`.
That way, when the change in artifact structure you mentioned with TVM 0.10 happens, it can be taken care of on the Autoware side by only changing `tvm_utility` instead of the individual packages that make use of it.
@liuzf1988 we are organising a meeting to discuss a way forward for TVM in Autoware. If you are interested in joining us please confirm your email address. Thank you.
We had a discussion among @ambroise-arm, @xmfcx, @BonoloAWF, @angry-crab last week on this topic. See this comment for the summary.
@liuzf1988 Do you have any progress with this issue? @angry-crab might be able to help you if there are any other blocking items.
@liuzf1988 Hi, it would be great if you could share the `definition.yaml` files so that I can take over the rest. Thanks.
@ambroise-arm Please correct me if I'm wrong. To sum up, based on the discussion above, we need to do the following:
- fix the input and output sizes of the network
- push the onnx files along with `definition.yaml` to the `model_zoo` repo
- edit the CLI script if necessary

Also, I was wondering: since `tvm_utility` is merged, is the `cuda` backend ready to be tested? Thanks.
@angry-crab Yes to the points you listed, plus:
- have a way to run the TVM model (I suggested here having a separate Autoware package for that, but it could be something else; I don't know what makes the most sense for what we want to achieve)
- document the procedure

For the cuda backend, it is possible to compile networks for it with the script in ModelZoo, and using the result of that compilation with `tvm_utility` should work. But the automated compilation and release pipeline of the ModelZoo CI only targets llvm and vulkan (although that could be changed if there is a need). So yes, but it will be a manual process.
EDIT: One more thing: the TVM runtime provided by tvm_vendor is not compiled with cuda support, so that is also something that will need to be compiled and installed locally. A very manual process, then.
I would recommend using the Vulkan backend for GPU acceleration, unless there is a specific need for cuda.
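For completeness, a quick Python sketch to check which TVM runtime backends are actually available in a given build (standard TVM runtime API):

```python
# Sketch: a backend is usable only if the runtime was compiled with its
# support and a matching device is present; dev.exist reports this.
import tvm

for name, dev in [("cpu", tvm.cpu(0)), ("cuda", tvm.cuda(0)), ("vulkan", tvm.vulkan(0))]:
    print(name, "available:", dev.exist)
```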
@ambroise-arm
Thanks for the information. I've pushed the centerpoint models: CenterPoint
I'll work on the migration to tvm. As for the `cuda` backend, I'll create another issue after this one is closed.
> @liuzf1988 Do you have any progress with this issue? @angry-crab might be able to help you if there are any other blocking items.

A ROS2 package named `lidar_centerpoint_tvm` was finished last week, but I encountered some problems during the debugging process and am working on them this week.
@liuzf1988
Thanks for the update. I had a look into the current `centerpoint` implementation. It is using some hand-crafted `cuda` kernels for processing intermediate data. Did you convert these kernels into `tvm` operations? It would be great if you could share the repo/link so that I could help.
@angry-crab
Yes, it took some time to rewrite the `cuda` code (currently it has been changed to a `std::thread` implementation). Can I submit the code in a pull request tomorrow (I need a little time to restore and organize the code)? Then we can debug and optimize the code together.
@angry-crab
Here are the two definitions for the `voxel_encoder` and `backbone_neck_head` onnx models of centerpoint.
Ideally, `cuda` kernels could be generated by `tvm` from a high-level abstraction such as TVMScript or tensor expressions (see the toy sketch below). In this issue, it makes more sense to first convert the kernels into C++ code and then migrate to tvm dialects to ensure correctness.
As a reference for using the current `tvm` utility, apollo_segmentation_tvm also uses C++ for preprocessing and postprocessing. I think the TVMScript implementation should be handled in a different issue after this one is closed.
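To illustrate the kind of high-level abstraction meant here, a toy sketch using TVM's tensor expression (TE) API, from which TVM generates the kernel code. The element-wise op is a made-up example, not a centerpoint kernel:

```python
# Toy sketch: a hand-written kernel replaced by a TVM tensor expression.
# The op (multiply by 2) is illustrative only; real centerpoint kernels
# are far more complex.
import numpy as np
import tvm
from tvm import te

n = te.var("n")
A = te.placeholder((n,), name="A", dtype="float32")
B = te.compute((n,), lambda i: A[i] * 2.0, name="B")  # the kernel definition

s = te.create_schedule(B.op)
f = tvm.build(s, [A, B], target="llvm")  # a cuda build would additionally
                                         # need axes bound to GPU threads

a = tvm.nd.array(np.arange(8, dtype="float32"))
b = tvm.nd.array(np.zeros(8, dtype="float32"))
f(a, b)
print(b.numpy())  # [ 0.  2.  4.  6.  8. 10. 12. 14.]
```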
Hi everyone, I'm one of the developers from the TVM community. I'm very pleased to find that TVM is being integrated into Autoware. Feel free to reach out for any help (I'm active in our discussion forum: https://discuss.tvm.apache.org/).
P.S. I worked on improving the Vulkan backend last year, and I consider it to be one of TVM's unique strengths. Glad to see that it has found a new industrial use case.
Hi, thank you so much for reaching out. TVM is a really great project with amazing features. We will definitely keep integrating and testing TVM. Please also let us know if there is anything we can do for the TVM community!
Closed because the PR is merged.