CINN

Compiler Infrastructure for Neural Networks

Results 100 CINN issues

Compile CUDA C source code by invoking nvcc through a system call to generate PTX and cubin.
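A minimal sketch of what such a system-call approach could look like, not CINN's actual implementation. `BuildNvccCommand` and `RunCommand` are hypothetical helpers, and the file names and `sm_70` architecture in the usage note are placeholders. Only the nvcc options `-ptx`, `-cubin`, `-arch`, and `-o` are relied on. Paths are not shell-quoted here, so this assumes they contain no spaces or special characters.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Hypothetical helper (not a CINN API): builds the nvcc command line.
// `kind` selects the output format and must be "ptx" or "cubin".
// `arch` is a GPU architecture name such as "sm_70" (placeholder value).
std::string BuildNvccCommand(const std::string& source_path,
                             const std::string& output_path,
                             const std::string& kind,
                             const std::string& arch) {
  assert(kind == "ptx" || kind == "cubin");
  return "nvcc -" + kind + " -arch=" + arch + " " + source_path +
         " -o " + output_path;
}

// Hands the command to the system shell via std::system.
// Returns true when the command reports exit status 0.
bool RunCommand(const std::string& command) {
  return std::system(command.c_str()) == 0;
}
```

With these helpers, `RunCommand(BuildNvccCommand("kernel.cu", "kernel.ptx", "ptx", "sm_70"))` would produce the PTX file, and the same call with `"cubin"` would produce the cubin, provided nvcc is on the `PATH`.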

Change some data structures in graph.group from shared_ptr to weak_ptr to prevent circular references during the fusion merge pass.

Use weak_ptr to avoid circular references.
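The idea can be illustrated with a small sketch. The `Group` struct below is a hypothetical stand-in, not CINN's actual group type, and the choice of which direction owns which (producers own consumers, consumers hold weak back references) is an assumption made only for the example. If both directions used `shared_ptr`, the two groups would keep each other alive forever.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical stand-in for a fusion group (not CINN's real type).
struct Group {
  std::vector<std::shared_ptr<Group>> consumers;  // owning forward edges
  std::vector<std::weak_ptr<Group>> producers;    // non-owning back edges
};

// Connects producer -> consumer. Only the forward edge shares ownership.
void Link(const std::shared_ptr<Group>& producer,
          const std::shared_ptr<Group>& consumer) {
  producer->consumers.push_back(consumer);
  consumer->producers.push_back(producer);  // stored as weak_ptr
}

// Returns true if both groups are destroyed once the local handles
// go out of scope, i.e. no reference cycle keeps them alive.
bool GroupsAreReleased() {
  std::weak_ptr<Group> watch_producer;
  std::weak_ptr<Group> watch_consumer;
  {
    auto producer = std::make_shared<Group>();
    auto consumer = std::make_shared<Group>();
    Link(producer, consumer);
    watch_producer = producer;
    watch_consumer = consumer;
    // The weak back edge does not add to the producer's owner count.
    assert(producer.use_count() == 1);
  }
  return watch_producer.expired() && watch_consumer.expired();
}
```

Code that needs to follow a back edge would call `lock()` on the `weak_ptr` and check the result before use, since the producer may already have been destroyed.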

Use a system call to invoke nvcc to compile CUDA C code and generate PTX and cubin.

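Support matmul_v2_grad. For a `scale->gemm->scale` structure, the backward pass can skip the two scale operations. When seq_len is large, the output matrix of q*k in attention is large, so running scale as a separate operation is also expensive.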
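One way to see why the separate scale operations are unnecessary is sketched below. It assumes the forward computation is Y = s2 * ((s1 * A) * B) and that the two factors are folded into a single GEMM coefficient `alpha = s1 * s2`. That folding is an assumption for illustration; the summary above does not say how the change implements it. All function names here are hypothetical reference helpers, not CINN operators.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

Matrix Transpose(const Matrix& m) {
  assert(!m.empty());
  Matrix t(m[0].size(), std::vector<double>(m.size()));
  for (std::size_t i = 0; i < m.size(); ++i)
    for (std::size_t j = 0; j < m[0].size(); ++j) t[j][i] = m[i][j];
  return t;
}

// Reference GEMM: C = alpha * A * B.
Matrix Gemm(const Matrix& a, const Matrix& b, double alpha) {
  assert(!a.empty() && !b.empty() && a[0].size() == b.size());
  Matrix c(a.size(), std::vector<double>(b[0].size(), 0.0));
  for (std::size_t i = 0; i < a.size(); ++i)
    for (std::size_t j = 0; j < b[0].size(); ++j) {
      double acc = 0.0;
      for (std::size_t k = 0; k < b.size(); ++k) acc += a[i][k] * b[k][j];
      c[i][j] = alpha * acc;
    }
  return c;
}

// Separate element-wise scale: one extra pass over the whole matrix.
Matrix Scale(const Matrix& m, double s) {
  Matrix out = m;
  for (auto& row : out)
    for (auto& v : row) v *= s;
  return out;
}

// Gradient w.r.t. A, walking back through scale, gemm, scale one by one.
Matrix GradAUnfused(const Matrix& dy, const Matrix& b, double s1, double s2) {
  Matrix d_gemm_out = Scale(dy, s2);
  Matrix d_scaled_a = Gemm(d_gemm_out, Transpose(b), 1.0);
  return Scale(d_scaled_a, s1);
}

// Same gradient with both scales folded into the GEMM coefficient.
Matrix GradAFused(const Matrix& dy, const Matrix& b, double s1, double s2) {
  return Gemm(dy, Transpose(b), s1 * s2);
}

bool AlmostEqual(const Matrix& x, const Matrix& y, double tol = 1e-12) {
  if (x.size() != y.size()) return false;
  for (std::size_t i = 0; i < x.size(); ++i) {
    if (x[i].size() != y[i].size()) return false;
    for (std::size_t j = 0; j < x[i].size(); ++j)
      if (std::fabs(x[i][j] - y[i][j]) > tol) return false;
  }
  return true;
}
```

The fused version touches the large gradient matrix once instead of three times, which is where the saving for large seq_len would come from. The gradient with respect to B follows the same pattern with `Gemm(Transpose(a), dy, s1 * s2)`.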