cloudhan

Results 33 issues of cloudhan

**Describe the bug** **Steps/Code to reproduce bug** ```cuda #include "cute/tensor.hpp" using namespace cute; __global__ void kernel() { constexpr auto weird = right_inverse(make_layout(_2{}, _1{})); print(weird); } int main() { kernel(); cudaDeviceSynchronize();...

bug
? - Needs Triage

**Describe the bug** As of b7508e337938137a699e486d8997646980acfc58, `Copy_Atom` causes a misaligned address. **Steps/Code to reproduce bug** ```cuda #include using namespace cute; __global__ void kernel(int m, int k, float* a, int lda) {...

bug
inactive-30d
CuTe

**Describe the bug** `make_tiled_copy` also should not secretly pad `Thr` and `Val`. See code sample and discussion. **Steps/Code to reproduce bug** ```cpp #include using namespace cute; int main() { std::vector...

bug
CuTe

`/opt/rocm/.info/version-dev` is only available if the `rocm-dev` metapackage is installed. Installing it pulls in many packages that users do not need; they may opt for fine...

https://github.com/Jimver/cuda-toolkit/issues/315 Just waiting for the upstream fix will be OK.

Hi, this is not an issue. I'd like to let newcomers know that I am developing [rules_cuda](https://github.com/cloudhan/rules_cuda.git). It has the following features: 1. Pure Starlark implementation 2. Supports both...

Some chats from Slack. Me: > @Gisle Dankel Do you have the context on why static linking is required on Windows for libkineto? Gisle: > It’s not - in fact...

enhancement

With ~10 steps of ResNet-50 profiled, there is a roughly 2.3 s freeze on a 10900X machine. The freeze is caused by **Recalculate Style** and **Layout**, which,...

bug
plugin