Lehman Garrison
I think that rolling out a `last_mu_bin_closed` flag for just the two `DDsmu` CFs is fine. I think that perhaps the flag could even be on by default, as it's...
I can't seem to reproduce this problem. Here's my attempt: ```python import numpy as np from Corrfunc.theory.DD import DD box = 10. X, Y, Z = np.random.rand(3,0)*box bins = np.linspace(0.1,...
Not sure... I'm worried that static scheduling will hurt the performance for more clustered cases. I'm also not sure "hiding" the problem with static scheduling is a good idea, even...
Speaking of detecting the race condition, maybe it's worth trying the GCC 9.2 thread sanitizer on the Python extension? The memory/address sanitizers were always hard to run because of all...
Another idea: maybe worth trying `OMP_NESTED=true` and `OMP_NESTED=false`? I don't think we're (intentionally) using any nested parallelism, but if somehow we are, it could mess with the thread IDs... An...
Thanks for the report! Glad you have a workaround, but we should get gcc 5 to work anyway. I don't have a gcc 5 environment immediately available, but let me...
Probably related to #183
@dstndstn Could you share the output of `gcc -march=native -dM -E - < /dev/null | grep AVX512` with your gcc 5 compiler? My best guess is that the CPU supports...
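For reference, the same check can be scripted. This is a minimal sketch, not part of Corrfunc: the helper names `parse_avx512_macros` and `compiler_avx512_macros` are hypothetical, and it assumes the compiler accepts the same `-march=native -dM -E -` invocation as the command above.

```python
import subprocess


def parse_avx512_macros(macro_dump):
    """Return the set of AVX512 macro names found in a `-dM -E` macro dump."""
    found = set()
    for line in macro_dump.splitlines():
        parts = line.split()
        # Macro dump lines look like: #define __AVX512F__ 1
        if len(parts) >= 2 and parts[0] == "#define" and "AVX512" in parts[1]:
            found.add(parts[1])
    return found


def compiler_avx512_macros(cc="gcc"):
    """Ask the compiler which AVX512 macros it defines for -march=native.

    Hypothetical helper: requires `cc` to be on PATH.
    """
    result = subprocess.run(
        [cc, "-march=native", "-dM", "-E", "-"],
        input="",            # empty stdin, equivalent to `< /dev/null`
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_avx512_macros(result.stdout)
```

Calling `compiler_avx512_macros("gcc")` returns the macro names that the `grep AVX512` pipeline would print, without their values.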
Thanks. So indeed, despite the CPU (and supposedly the compiler??) supporting AVX512VL, the compiler is not trying to use it. @manodeep Maybe there's a way to detect this scenario manually...
@manodeep If we can detect the scenario, it seems we should just fix it ourselves by adding the `-mavx512vl` flag. The flag is mentioned in the release notes: https://gcc.gnu.org/gcc-5/changes.html But...
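A rough sketch of what that detection could look like, under stated assumptions: the CPU feature names come from the `flags` line of `/proc/cpuinfo` on Linux, the compiler macros come from a `-dM -E` dump, and the function names `needs_explicit_avx512vl` and `extra_cflags` are hypothetical rather than anything in the Corrfunc build system.

```python
def needs_explicit_avx512vl(cpu_flags, compiler_macros):
    """True when the CPU reports AVX512VL but the compiler did not enable it.

    cpu_flags: iterable of lowercase CPU feature names (e.g. from /proc/cpuinfo).
    compiler_macros: iterable of macro names the compiler defines for -march=native.
    """
    cpu_has_vl = "avx512vl" in set(cpu_flags)
    compiler_uses_vl = "__AVX512VL__" in set(compiler_macros)
    return cpu_has_vl and not compiler_uses_vl


def extra_cflags(cpu_flags, compiler_macros):
    """Return the extra compiler flags to append in the mismatch scenario."""
    if needs_explicit_avx512vl(cpu_flags, compiler_macros):
        return ["-mavx512vl"]
    return []
```

Whether forcing the flag is safe on a given compiler version is a separate question; the sketch only covers the detection step.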