Petr
I have a solution for independent `3 + 3 + 2 + 3` logits. I will prepare a pull-request when I have time.
I'd like to say that there is more than `constexpr`. `explicit` and `noexcept` are also redefined.
```C++
#ifndef constexpr
#define constexpr static const
#endif
#ifndef explicit
#define explicit
#endif
#ifndef...
```
Hi @awjuliani, yes, I run this binary. I also figured out that running in headless mode (`realtime_mode=False`) makes seeds work properly, according to `UnitySDK.log`.
I have exactly the same problem while trying to enable "Collect run-time types information for code insight". The PyCharm version is:
```
PyCharm 2016.3.3
Build #PY-163.15188.4, built on March 10,...
```
For block sizes, maybe we should look into `cudaOccupancyMaxPotentialBlockSize`: https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__HIGHLEVEL.html#group__CUDART__HIGHLEVEL_1gee5334618ed4bb0871e4559a77643fc1 It runs the occupancy calculation for a given kernel function and returns a suggested block size along with the minimum grid size needed to reach that occupancy.
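To make the suggestion concrete, here is a rough sketch of how the call could be wired into a launch. `scaleKernel`, `ceilDiv`, and `launchScale` are hypothetical names made up for illustration, not code from this repository; only `cudaOccupancyMaxPotentialBlockSize` and `cudaGetLastError` are real runtime calls.

```cuda
#include <cuda_runtime.h>

// Hypothetical kernel, present only so there is something to query.
__global__ void scaleKernel(float* data, float factor, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

// Number of blocks needed to cover n elements with the given block size.
inline int ceilDiv(int n, int blockSize) {
    return (n + blockSize - 1) / blockSize;
}

// Launches scaleKernel with the block size suggested by the runtime
// instead of a hard-coded constant.
cudaError_t launchScale(float* d_data, float factor, int n) {
    if (n <= 0) return cudaSuccess;  // nothing to do; a zero-sized grid is invalid

    int minGridSize = 0;  // smallest grid that can reach full occupancy
    int blockSize = 0;    // suggested threads per block
    cudaError_t err = cudaOccupancyMaxPotentialBlockSize(
        &minGridSize, &blockSize, scaleKernel,
        0,   // dynamic shared memory per block, in bytes
        0);  // no upper limit on the block size
    if (err != cudaSuccess) return err;

    scaleKernel<<<ceilDiv(n, blockSize), blockSize>>>(d_data, factor, n);
    return cudaGetLastError();
}
```

The suggestion depends on the compiled kernel and the current device, so a debug build may well get a smaller block size than a release build.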
@rosslwheeler yes, I had exactly the same issue. Seems like the kernel uses too many registers? Reducing block size to 512 in debug mode makes the code work.
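One way to check the register guess, and to avoid hard-coding 512, might be to ask the runtime what the compiled kernel can actually support. This is only a sketch: `heavyKernel`, `clampBlockSize`, and `safeBlockSize` are hypothetical stand-ins, while `cudaFuncGetAttributes` and the `numRegs` and `maxThreadsPerBlock` fields of `cudaFuncAttributes` come from the CUDA runtime.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical stand-in for the kernel that fails to launch in debug mode.
__global__ void heavyKernel(float* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = static_cast<float>(i);
}

// Returns the requested block size, capped at the given limit.
inline int clampBlockSize(int requested, int limit) {
    return requested < limit ? requested : limit;
}

// Queries the per-kernel launch limit and caps the requested block size.
// maxThreadsPerBlock already accounts for the registers this particular
// build of the kernel uses on the current device.
cudaError_t safeBlockSize(int requested, int* blockSize) {
    cudaFuncAttributes attr;
    cudaError_t err = cudaFuncGetAttributes(&attr, heavyKernel);
    if (err != cudaSuccess) return err;

    std::printf("registers per thread: %d, max threads per block: %d\n",
                attr.numRegs, attr.maxThreadsPerBlock);
    *blockSize = clampBlockSize(requested, attr.maxThreadsPerBlock);
    return cudaSuccess;
}
```

If `maxThreadsPerBlock` comes back below 1024 in the debug build only, that would support the register explanation.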
1. Moved the stack collapse script into `dev/tools`
2. Removed `pandas` from `requirements.txt`