tvm: The change improves performance in multi-threaded environments.

Open shtinsa opened this issue 3 years ago • 2 comments

This change improves data locality, which may have a positive effect on final inference performance in the case of multi-threaded (MT) execution.

Thanks for contributing to TVM! Please refer to guideline https://tvm.apache.org/docs/contribute/ for useful information and tips. After the pull request is submitted, please request code reviews from Reviewers by @ them in the pull request thread.

Aug 05 '22 16:08 shtinsa

[Image: res_dlrm-0805_three] The picture shows the performance improvements for the DLRM model (the point cloud was built on a system with 48 cores).

Aug 05 '22 18:08 shtinsa

@jwfromm @tkonolige could you please review? The changes may increase memory usage, but at the same time they reduce data thrashing inside the CPU caches.

Aug 05 '22 18:08 shtinsa

The word "Global" means that it's a global singleton. Changing the semantics to thread-local will confuse developers.

Moreover, this patch works only because you initialise the VirtualMachine modules from separate threads. That is not mandatory behaviour. A user may create a set of VirtualMachine instances in the main thread and only afterwards assign them to worker threads.

This change is good enough to demonstrate the possibility of advanced memory management for multi-instance execution, but it cannot be merged as is.

Aug 23 '22 09:08 apeskov
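
To make the objection above concrete, here is a minimal sketch of why a thread-local "singleton" only helps when each instance is created on its own worker thread. Everything in it (`ToyManager`, `ToyVM`, `ThreadLocalManager`, and the two helper functions) is a hypothetical stand-in invented for illustration, not TVM's actual memory manager or VirtualMachine API; it assumes only that a VM binds to whatever manager is visible on the thread that constructs it.

```cpp
#include <atomic>
#include <cstddef>
#include <set>
#include <thread>
#include <vector>

// Hypothetical stand-in for a memory manager (not a TVM class).
// Each instance receives a unique id so that distinct instances can be counted.
struct ToyManager {
  static std::atomic<int> next_id;
  int id;
  ToyManager() : id(next_id.fetch_add(1)) {}
};
std::atomic<int> ToyManager::next_id{0};

// Thread-local accessor: every thread lazily gets its own ToyManager.
ToyManager* ThreadLocalManager() {
  static thread_local ToyManager inst;
  return &inst;
}

// Hypothetical stand-in for a VM that binds to a manager at construction time.
struct ToyVM {
  int manager_id;
  ToyVM() : manager_id(ThreadLocalManager()->id) {}
};

// Case 1: each VM is constructed on its own worker thread.
std::size_t DistinctManagersWhenCreatedInWorkers(int n) {
  std::vector<int> ids(n);
  std::vector<std::thread> workers;
  for (int i = 0; i < n; ++i) {
    workers.emplace_back([&ids, i] {
      ToyVM vm;  // constructed on the worker thread
      ids[i] = vm.manager_id;
    });
  }
  for (auto& w : workers) w.join();
  return std::set<int>(ids.begin(), ids.end()).size();
}

// Case 2: all VMs are constructed on the calling thread, then used by workers.
std::size_t DistinctManagersWhenCreatedInMain(int n) {
  std::vector<ToyVM> vms(n);  // all constructed on the calling thread
  std::vector<int> ids(n);
  std::vector<std::thread> workers;
  for (int i = 0; i < n; ++i) {
    workers.emplace_back([&vms, &ids, i] { ids[i] = vms[i].manager_id; });
  }
  for (auto& w : workers) w.join();
  return std::set<int>(ids.begin(), ids.end()).size();
}
```

Under these assumptions, `DistinctManagersWhenCreatedInWorkers(4)` yields four separate managers, while `DistinctManagersWhenCreatedInMain(4)` yields a single manager shared by all workers, so the locality benefit disappears in the second usage pattern.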
