biergaizi

39 issues by biergaizi

When building the following empty Kokkos program using `-DKokkos_ENABLE_THREADS=1 -DKokkos_ENABLE_HWLOC=1` with GCC 13 or Clang 16, the system monitor shows 100% CPU usage across almost all cores (one core is idle,...

Question
Performance

**Describe the bug** I tried to use likwid-bench to measure memory bandwidth on an AMD Ryzen 5 3500X (Zen 2)...

bug

### Motivation I'm doing some micro-benchmarks on AMD GPUs to understand their performance characteristics in order to improve kernel performance. I now suspect that different register allocation and instruction scheduling...

I'm trying to write a Userscript to change the behavior of websites by monkey-patching the built-in JavaScript functions, for example, replacing them with my custom versions to intercept their actions....

help wanted

Many GPUs contain special optimizations for texture memory, originally meant for 3D graphics. This usually involves storing 2D or 3D data in a device-specific tiled format to increase its data...

enhancement

**Bug summary** To hide latency or conserve registers, OpenSYCL / LLVM aggressively interleaves arithmetic and memory instructions. Unfortunately, in many cases, these attempts reduce the regularity of memory access patterns...

bug

**Bug summary** Currently, OpenSYCL determines the path of Boost by reading `Boost_LIBRARY_DIR`. However, according to [CMake 3.2's documentation](https://cmake.org/cmake/help/v3.2/module/FindBoost.html), `Boost_LIBRARY_DIR` is not a result variable but an internal cache variable, and...

bug

I just investigated the performance of the current FDTD engine implementation, and I've identified two bottlenecks that are causing unnecessary slowdowns in simulations - at the same time, they're easy...

Recently, some developers have become interested in exploring different ways to speed up the FDTD engine; examples include my #100 proposal and #105 patch, and @MircoController's experimental CUDA kernel at https://github.com/thliebig/openEMS-Project/discussions/36. To...

For openEMS users, it would be insightful to know how different hardware and software configurations (CPU, memory, multi-threading, MPI clustering) affect simulation speed via a set of standard tests. However,...