Wuwei Lin

Results 33 comments of Wuwei Lin

It doesn't exist yet. It will use arith analysis or some annotation as a hint. Simple layout conversions like reordering / packing axes should be well supported because...
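To make "reordering / packing axes" concrete, here is a minimal sketch in plain Python of one such conversion: packing the channel axis of an NCHW layout into blocks of 4 (often written NCHW4c). The function names and the block size are assumptions chosen for illustration; this is not the analysis discussed in the comment, only the kind of index map it would have to reason about.

```python
from itertools import product


def pack_nchw_to_nchw4c(n, c, h, w, block=4):
    """Hypothetical index map: NCHW -> NCHW{block}c (split the channel axis)."""
    return (n, c // block, h, w, c % block)


def unpack_nchw4c_to_nchw(n, c_outer, h, w, c_inner, block=4):
    """Inverse of the map above: merge the two channel axes back into one."""
    return (n, c_outer * block + c_inner, h, w)


# Enumerate every index of a small NCHW tensor whose channel count
# is divisible by the block size.
shape = (1, 8, 2, 2)
indices = list(product(*(range(dim) for dim in shape)))
packed = [pack_nchw_to_nchw4c(*idx) for idx in indices]

# The map is a bijection on this shape: no two indices collide,
# and unpacking recovers the original index.
assert len(set(packed)) == len(indices)
assert all(unpack_nchw4c_to_nchw(*p) == idx for p, idx in zip(packed, indices))
```

Because the map only uses floor division and modulo by a constant, it is the sort of expression that an arithmetic simplifier can typically prove invertible when the channel extent is a multiple of the block size.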

Thanks for the discussion. @psrivas2 There is no analysis to check if a PrimFunc is layout agnostic. After graph op lowering, there is no such layout information in PrimFunc (unless...

@psrivas2 exactly, it can be made equivalent

`As a result, expressions such as R.zeros([16], "int32") would be extracted out into the parameter transformation, even though they do not depend on any parameters. ` Does this affect the...

Here is an example; the number of weights is large enough to see the difference (1.2 ms vs 6.2 ms): https://gist.github.com/vinx13/ea7a8c785d8d0d5ae5318e8ace085db2

@masahi this is probably the case, it didn't happen for this kernel before though

we can move the GPU builder to a larger CPU instance if needed

@ysh329 a tag will be created automatically if you create a release on GitHub

@ysh329 you may need to set up your GitHub account following https://cwiki.apache.org/confluence/display/OPENWHISK/Accessing+Apache+GitHub+as+a+Committer

The error `CUDA_ERROR_NO_BINARY_FOR_GPU` is likely due to a mismatch of the CUDA arch; you can try specifying the arch in the target
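As a rough sketch of what specifying the arch can look like, the helper below builds a TVM target string from a GPU's compute capability. The helper name `cuda_target_string` is hypothetical, and the compute capability (8.0 here) is an assumption that has to match the actual device. The resulting string would then be passed to `tvm.target.Target`, shown only as a comment so the snippet runs without TVM installed.

```python
def cuda_target_string(major: int, minor: int) -> str:
    """Build a CUDA target string such as 'cuda -arch=sm_80'.

    `major` and `minor` are the device's compute capability,
    e.g. (8, 0) for sm_80. This helper is illustrative only.
    """
    if major < 0 or not 0 <= minor <= 9:
        raise ValueError("compute capability must look like (8, 0) or (7, 5)")
    return f"cuda -arch=sm_{major}{minor}"


target_str = cuda_target_string(8, 0)
print(target_str)

# With TVM installed, the string can be turned into a target object:
#   import tvm
#   target = tvm.target.Target(target_str)
```

If the arch in the target is newer than what the GPU supports, or the GPU is newer than a binary-only arch, the driver cannot find a matching kernel image, which is consistent with the error named above.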