cutt
CUDA Tensor Transpose (cuTT) library
Hi, I am running into this problem when constructing **cuttPlan**: ` cuttPlan(&m_rot_plan[0], 3, dim_0, permu_0, sizeof(int), nullptr);` and I also used valgrind to test it. The relevant message is: `==8915==...
Hi, many thanks for your library - it seems to be a really useful tool for GPU codes! I am testing it on Summit and find the following error: cudaFuncSetSharedMemConfig(transposePacked,...
Would you like to wrap any pointer data members with the class template “[std::unique_ptr](https://en.wikipedia.org/wiki/Smart_pointer#unique_ptr "Description for the usage of smart pointers")”? Update candidates: - [TensorC class](https://github.com/ap-hynninen/cutt/blob/4c251c60e38f9c8a61a9ebbb3630d8ab01ba9da1/src/cuttplan.cpp#L139) - [TensorTester class](https://github.com/ap-hynninen/cutt/blob/4c251c60e38f9c8a61a9ebbb3630d8ab01ba9da1/src/TensorTester.cu#L105)
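To illustrate the kind of change being suggested, here is a minimal sketch of a class that owns a buffer through `std::unique_ptr` instead of a raw pointer. `IntBuffer` is a hypothetical class made up for this example; it is not the `TensorC` or `TensorTester` class from the repository, and it assumes a C++14 compiler for `std::make_unique`.

```cpp
#include <cstddef>
#include <memory>
#include <utility>

// Hypothetical class for illustration only; not part of cuTT.
class IntBuffer {
 public:
  // make_unique<int[]>(n) allocates n value-initialized (zeroed) ints.
  explicit IntBuffer(std::size_t n)
      : size_(n), data_(std::make_unique<int[]>(n)) {}

  int* get() { return data_.get(); }
  std::size_t size() const { return size_; }

  // No destructor, copy, or move members are written by hand:
  // unique_ptr frees the array, forbids copying, and allows moving.

 private:
  std::size_t size_;
  std::unique_ptr<int[]> data_;
};
```

With this layout the compiler-generated move constructor transfers ownership, and the moved-from object is left holding a null pointer, so a double free cannot happen.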
==21682== Conditional jump or move depends on uninitialised value(s) ==21682== at 0x41E27D: computePos0(int, int const*, int const*, int const*, int const*, int*, int*) (cuttGpuModel.cpp:249) ==21682== by 0x41E429: computePos0(int, TensorConvInOut const*,...
Output is empty when one of the dims is 1, such as ` int dim[4] = {W, H, C, N}; int permutation[4] = {3, 0, 1, 2}; cuttHandle handle; cuttPlan(&handle, 4,...
Hello, I am using cuTT to do a transpose, but I have run into a problem: 'Illegal instruction (core dumped)'. My code is `int main() { // Four dimensional tensor // Transpose (31,...
cuTT now supports tensor transpositions of the form `B_{perm(i0,i1,...)} = alpha * A_{i0,i1,...} + beta * B_{perm(i0,i1,...)}`, with alpha and beta being scalars. TODO: support for alpha and...
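For checking results against the formula above, a plain CPU reference can be handy. The sketch below is not cuTT code: `transposeAxpbyReference` is a hypothetical helper, and the index convention (axis 0 varies fastest, output axis `j` takes input axis `perm[j]`) is an assumption made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// CPU reference for B_perm = alpha * A + beta * B_perm.
// Assumed convention (not confirmed by the source): axis 0 varies fastest,
// and output axis j corresponds to input axis perm[j].
template <typename T>
void transposeAxpbyReference(const std::vector<int>& dim,
                             const std::vector<int>& perm,
                             T alpha, const std::vector<T>& a,
                             T beta, std::vector<T>& b) {
  const std::size_t rank = dim.size();
  assert(perm.size() == rank);

  // Total number of elements in the tensor.
  std::size_t volume = 1;
  for (int d : dim) volume *= static_cast<std::size_t>(d);
  assert(a.size() == volume && b.size() == volume);

  // For each input axis, the stride it has in the permuted output.
  std::vector<std::size_t> strideForInputAxis(rank);
  std::size_t stride = 1;
  for (std::size_t j = 0; j < rank; ++j) {
    strideForInputAxis[perm[j]] = stride;
    stride *= static_cast<std::size_t>(dim[perm[j]]);
  }

  // Walk the input linearly, decode its multi-index, and accumulate
  // the matching linear offset in the output.
  for (std::size_t in = 0; in < volume; ++in) {
    std::size_t rem = in;
    std::size_t out = 0;
    for (std::size_t k = 0; k < rank; ++k) {
      const std::size_t extent = static_cast<std::size_t>(dim[k]);
      out += (rem % extent) * strideForInputAxis[k];
      rem /= extent;
    }
    b[out] = alpha * a[in] + beta * b[out];
  }
}
```

With `alpha = 1` and `beta = 0` this reduces to an ordinary out-of-place transpose, so the same helper can serve as a baseline for both cases.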