Yinghai Lu
@842974287 I think we should try to compile/screen the TF code to make sure it actually works for TF, for example to avoid fixes like https://github.com/NVIDIA/FasterTransformer/commit/55c6c6955e1975b8866a3b6f74c1f847d3d9ee9a
If the whole graph is lowerable, what's the difference between this and using fusion?
@jackm321 can you take a look?
@frank-wei could you take a look? It seems quite unexpected.
@carlushuang Hey, sorry about the breakage. I wonder if it makes sense to add a ROCm CI to guard it.
Hi, if you mean inference deployment, then yes, requiring nvcc availability for online compilation is quite a problem. What we usually do is compile the .so file ahead of time...
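For reference, a minimal C++ sketch of that deployment flow, assuming the module was exported offline (e.g. via `export_library` on the Python side) to a file named `deploy.so` that exposes a function called `main`; both names are illustrative:

```
// Load a TVM module compiled ahead of time, so nvcc is not required
// on the deployment machine. "deploy.so" and "main" are hypothetical
// names for this sketch.
#include <tvm/runtime/module.h>
#include <tvm/runtime/packed_func.h>

int main() {
  tvm::runtime::Module mod =
      tvm::runtime::Module::LoadFromFile("deploy.so");
  tvm::runtime::PackedFunc fn = mod.GetFunction("main");
  // ... prepare DLTensor/NDArray arguments and invoke fn(...) as usual.
  return 0;
}
```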
I don't think the current integration supports CUDA right now, but we have something WIP. @ilia-cher
Yeah, we currently don't have plans for that. If you are interested, please add a patch. :)
```
/Users/serhaty/local/tvm/torch_tvm/register.cpp:124:14: error: conversion from 'tvm::runtime::TVMArgValue' to 'size_t' (aka 'unsigned long') is ambiguous
    size_t id = args[0];
```
seems legit
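A minimal sketch of the usual disambiguation, assuming the offending line is the `size_t id = args[0];` shown above: `TVMArgValue` defines conversions to several integer types, so naming one conversion explicitly resolves the ambiguity (the helper below is illustrative, not the actual patch):

```
#include <cstdint>
#include <tvm/runtime/packed_func.h>

// Direct assignment of a TVMArgValue to size_t is ambiguous because the
// class converts to multiple integer types. Picking int64_t explicitly
// and then casting makes the conversion well defined.
size_t GetId(const tvm::runtime::TVMArgs& args) {
  int64_t raw = args[0];            // unambiguous: operator int64_t()
  return static_cast<size_t>(raw);  // deliberate cast to size_t
}
```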
Does this clash with `import tvm`?