Kyurae Kim
I found out after painful debugging that dlib's `find_global_min` fails with a segmentation fault when the observed values contain `NaN`. What do you think about including a `NaN` checking procedure...
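One way such a check could look is sketched below. Both `all_finite` and `reject_non_finite` are hypothetical names introduced for illustration; they are not part of dlib, and the wrapper assumes a single-variable objective. The idea is to turn a `NaN` (or infinite) objective value into an exception before it reaches the optimizer.

```cpp
#include <algorithm>
#include <cmath>
#include <stdexcept>
#include <vector>

// Hypothetical helper (not a dlib function): true if every value is finite,
// i.e. neither NaN nor +/-infinity.
inline bool all_finite(const std::vector<double>& values) {
    return std::all_of(values.begin(), values.end(),
                       [](double v) { return std::isfinite(v); });
}

// Hypothetical wrapper (not a dlib function): returns a callable that
// evaluates `f` and throws if the result is not finite, so the failure
// surfaces as an exception at the point where the bad value is produced.
template <typename F>
auto reject_non_finite(F f) {
    return [f](double x) {
        const double y = f(x);
        if (!std::isfinite(y)) {
            throw std::domain_error("objective returned a non-finite value");
        }
        return y;
    };
}
```

Under these assumptions, the wrapped callable would be passed to the optimizer in place of the raw objective, and `all_finite` could validate a batch of precomputed observations up front.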
There's a new [subreddit for open-source libraries](https://www.reddit.com/r/cpp_review/). If a library receives enough votes, it will get a *Meeting C++ certification*. I think rapidcheck deserves more users than it...
Hi, would you guys be interested in an implementation of the adaptive HMC scheme proposed in ["Adaptive hamiltonian and riemann manifold monte carlo."](https://dl.acm.org/doi/10.5555/3042817.3043100)? I'm currently writing an implementation to...
Hi, `Coupling` currently has an issue with differentiation. Here's a reproducible example.

```julia
using Bijectors
using Flux
using ProgressMeter
using StatsBase
using StatsPlots
using Turing
using Zygote

function main()
    n_iter...
```
mgcpp internally uses a lot of temporaries, but cudaMalloc performs very poorly. Using low-latency memory allocators should boost performance a lot.

References:

- [tcmalloc](http://goog-perftools.sourceforge.net/doc/tcmalloc.html), Google
- [THC...
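As a rough illustration of the idea, the sketch below caches released blocks by size so that repeated temporaries of the same size skip the expensive backend call. `CachingPool` is a hypothetical class, not mgcpp's allocator, and `std::malloc`/`std::free` stand in for `cudaMalloc`/`cudaFree` so the example runs without a GPU.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>
#include <unordered_map>
#include <vector>

// Hypothetical size-bucketed caching pool (illustration only).
// std::malloc/std::free are stand-ins for cudaMalloc/cudaFree.
class CachingPool {
public:
    CachingPool() = default;
    CachingPool(const CachingPool&) = delete;
    CachingPool& operator=(const CachingPool&) = delete;

    // Return all cached blocks to the backend. Blocks still held by
    // callers are not tracked and must be released before destruction.
    ~CachingPool() {
        for (auto& bucket : free_blocks_) {
            for (void* p : bucket.second) std::free(p);
        }
    }

    // Reuse a cached block of exactly `bytes` if one exists;
    // otherwise fall back to the (slow) backend allocation.
    void* allocate(std::size_t bytes) {
        auto& bucket = free_blocks_[bytes];
        if (!bucket.empty()) {
            void* p = bucket.back();
            bucket.pop_back();
            ++hits_;
            return p;
        }
        ++misses_;
        void* p = std::malloc(bytes);
        if (p == nullptr) throw std::bad_alloc();
        return p;
    }

    // Keep the block for later reuse instead of freeing it.
    void release(void* p, std::size_t bytes) {
        free_blocks_[bytes].push_back(p);
    }

    std::size_t hits() const { return hits_; }
    std::size_t misses() const { return misses_; }

private:
    std::unordered_map<std::size_t, std::vector<void*>> free_blocks_;
    std::size_t hits_ = 0;
    std::size_t misses_ = 0;
};
```

Exact-size buckets keep the sketch short; a real allocator would likely round sizes up to bins and handle thread safety, which is omitted here.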
Implement a Fast Fourier Transform CUDA kernel or add cuFFT to the library.

### preliminaries

- Implement complex type matrix/vector
Make an efficient CUDA microbenchmark framework. The current workflow of writing/optimizing CUDA kernels is very difficult because there is no proper, consistent way of measuring the performance of kernels....
Implement adapters for Blaze, Eigen, uBLAS, plain arrays, and std::vector. The adapters for Blaze are partially implemented but untested, and they suffer from a serious problem: memory padding.
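To make the padding problem concrete: a padded row-major matrix stores each row `stride` elements apart, with `stride >= cols`, so a flat copy of `rows * cols` elements would pick up padding as data. The sketch below strips the padding on the host. `strip_padding` is a hypothetical helper, not an existing adapter, and the stride is assumed to be supplied by the source library.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper (illustration only): copy a rows x cols row-major
// matrix whose rows start `stride` elements apart (stride >= cols) into
// contiguous storage without the per-row padding.
inline std::vector<float> strip_padding(const float* src,
                                        std::size_t rows,
                                        std::size_t cols,
                                        std::size_t stride) {
    std::vector<float> dense(rows * cols);
    for (std::size_t r = 0; r < rows; ++r) {
        const float* row_begin = src + r * stride;
        // Copy only the first `cols` elements of each padded row.
        std::copy(row_begin, row_begin + cols, dense.data() + r * cols);
    }
    return dense;
}
```

For a direct host-to-device transfer, a pitched copy such as `cudaMemcpy2D` could do the same job in one call by passing the padded row size as the source pitch; whether that suits the adapters is a design question for the library.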
Implement a GPU-less dummy test mode. By using stateful allocators and stubs, we might be able to run certain tests on systems without a GPU. In order to implement this, a...
Add parallel linear equation solvers and eigensolvers. Primarily considering cuSOLVER; alternatively, find a dependable third-party library or implement them ourselves.