tvm
Update scan.py to fix Pascal error
https://github.com/mlc-ai/mlc-llm/issues/3231
While using mlc-llm, I ran into an error at link time, and this patch resolved it. The patch may not address the root cause, but it is suitable for my use case, and the performance loss is at the noise level. The root cause may be missing instructions on Pascal or a bug in NVIDIA's Thrust library. This patch is intended as an emergency mitigation; I look forward to a better fix.
On Pascal GPUs, can_use_thrust(target, "tvm.contrib.thrust.sum_scan") returns True, but the Thrust path then actually fails.
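For illustration only, the kind of guard described above could look like the sketch below. It is not the actual change to scan.py: `is_pascal` and `use_thrust_scan` are hypothetical helper names, the sketch assumes the target architecture is available as a string such as "sm_60", and the `thrust_available` argument stands in for the result of the existing `can_use_thrust(...)` check. The one fact it relies on is that Pascal GPUs report compute capability 6.x.

```python
import re

# Hypothetical helpers for illustration; these names are not part of TVM.
PASCAL_MAJOR = 6  # Pascal GPUs report compute capability 6.x (e.g. sm_60, sm_61)


def is_pascal(arch: str) -> bool:
    """Return True if an arch string such as "sm_61" denotes a Pascal GPU."""
    match = re.fullmatch(r"sm_(\d{2,3})a?", arch.strip())
    if match is None:
        return False  # unknown format: do not treat it as Pascal
    digits = match.group(1)
    major = int(digits[:-1])  # the last digit is the minor version
    return major == PASCAL_MAJOR


def use_thrust_scan(arch: str, thrust_available: bool) -> bool:
    """Take the Thrust scan path only when it is available and not on Pascal.

    `thrust_available` is assumed to be the result of the existing
    availability check; this sketch does not call into TVM itself.
    """
    return thrust_available and not is_pascal(arch)


if __name__ == "__main__":
    for arch in ("sm_60", "sm_61", "sm_75", "sm_100"):
        print(arch, use_thrust_scan(arch, thrust_available=True))
```

Matching on the major version rather than on a single architecture string means every 6.x device is excluded from the Thrust path, at the cost of also excluding any Pascal configuration where Thrust might have worked.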
Could you include sm_61 as well? The NVIDIA Tesla P40 has the same issue.