Anthony Chang

Results 16 issues of Anthony Chang

There seems to be an issue with the current Nvidia and Intel platforms (AMD does not seem to have this issue) when I compile Boost.Compute into a dynamic library; segfault is...

By specializing `template struct set_kernel_arg` when expanding BOOST_COMPUTE_ADAPT_STRUCT, a kernel input argument can now take a custom struct by value. This can potentially make use of the compute device's constant memory.

Sometimes it is desirable to have CL-accelerated libraries initialize with a user context managed by some outside application. But Boost.Compute currently hides the global state of `context/device/queue` from users; there is no...

Include static_assert.hpp header in , required since commit 6c6a1089

In the official Boost.Compute [documentation](https://www.boost.org/doc/libs/1_69_0/libs/compute/doc/html/boost_compute/getting_started.html#boost_compute.getting_started.compilation_and_usage), the tutorial starts by showing how to get Boost.Compute running as succinctly as: `g++ -I/path/to/compute/include main.cpp -lOpenCL`. But it won't work. It seems we also need...

In `DBoW3.h` there is a reference to a CC-ShareAlike license. At the same time it also refers to LICENSE.txt, which clearly states a modified BSD license. So I am not sure which...

The issue comes from boostorg/compute#817, where `boost::trim()` is invoked to sanitize OpenCL kernel arguments from preprocessor-generated strings. One of the failing cases, taken directly from the Boost.Compute repo, is...
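
For readers unfamiliar with this kind of sanitizing step, here is a minimal sketch of whitespace trimming using only the standard library. It is not the Boost implementation and not the code from the referenced PR; `trim_copy` is a hypothetical helper, and the example argument strings are made up for illustration.

```cpp
#include <string>

// Hypothetical helper (not part of Boost.Compute): returns a copy of `s`
// with leading and trailing ASCII whitespace removed, similar in spirit
// to what boost::trim() does in place.
inline std::string trim_copy(const std::string& s) {
    const char* whitespace = " \t\n\r\f\v";
    const std::string::size_type first = s.find_first_not_of(whitespace);
    if (first == std::string::npos) {
        return std::string();  // empty or all-whitespace input
    }
    const std::string::size_type last = s.find_last_not_of(whitespace);
    return s.substr(first, last - first + 1);
}
```

Only the ends are touched: whitespace inside the argument text (for example between a type and a name) is preserved.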

Bare minimum batched multihead attention backward kernel. Many missing functionalities:

- ~alpha(QK) scaling~ **implemented**
- masking
- dropout

Some quirks that need to be ironed out too. Eg:

- A/B/B1/C...

We do not have grouped att for now, @rosenrodt. @asroy Do we need instances for group bmm+softmax+gemm+permute? _Originally posted by @shaojiewang in https://github.com/ROCmSoftwarePlatform/composable_kernel/issues/425#issuecomment-1252494368_

Some test cases fail due to a mismatch between the tensor's rank and the accessor's rank. The trick here is to not specify CMAKE_BUILD_TYPE so the resulting compiler flags will not include -DNDEBUG....
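
To illustrate why the absence of `-DNDEBUG` matters, the sketch below shows how a standard `assert` is compiled out when `NDEBUG` is defined. The names `assertions_enabled` and `ranks_match` are hypothetical and are not taken from the project in question; they only mimic a rank check of the kind described.

```cpp
#include <cassert>
#include <cstddef>

// Reports whether assert() is active in this translation unit.
// assert() expands to nothing when NDEBUG is defined.
constexpr bool assertions_enabled() {
#ifdef NDEBUG
    return false;
#else
    return true;
#endif
}

// Hypothetical rank check: without NDEBUG a mismatch aborts the program
// at the assert; with NDEBUG the mismatch is only visible in the return value.
inline bool ranks_match(std::size_t tensor_rank, std::size_t accessor_rank) {
    assert(tensor_rank == accessor_rank && "tensor/accessor rank mismatch");
    return tensor_rank == accessor_rank;
}
```

A build with `NDEBUG` defined would let a mismatched call pass silently through the assert, which is consistent with the failures only showing up when that flag is absent.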