galaxy
add more multithreading to code
use thread_local and optimize usage of std::execution::par to fix performance bottlenecks
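
For illustration only, here is a minimal sketch of one way thread_local can be combined with std::execution::par. It is not taken from the galaxy code: the function name row_medians and the median computation are hypothetical, chosen to show a per-thread scratch buffer that avoids one heap allocation per element. With GCC's libstdc++, the parallel policies are typically backed by TBB and may require linking with -ltbb.

```cpp
#include <algorithm>
#include <cassert>
#include <execution>
#include <vector>

// Hypothetical example: compute the (upper) median of every row in parallel.
// Empty rows yield 0.0 by convention (an assumption made for this sketch).
std::vector<double> row_medians(const std::vector<std::vector<double>>& rows) {
    std::vector<double> out(rows.size());
    std::transform(
        std::execution::par, rows.begin(), rows.end(), out.begin(),
        [](const std::vector<double>& row) -> double {
            // One scratch buffer per worker thread. Its capacity survives
            // between calls on the same thread, so most iterations do not
            // allocate. No locking is needed because threads never share it.
            thread_local std::vector<double> scratch;
            scratch.assign(row.begin(), row.end());  // overwrite stale contents
            if (scratch.empty()) {
                return 0.0;
            }
            auto mid = scratch.begin() + scratch.size() / 2;
            assert(mid != scratch.end());
            // Partial sort: places the median element at position mid.
            std::nth_element(scratch.begin(), mid, scratch.end());
            return *mid;
        });
    return out;
}
```

The buffer is reassigned at the start of every call because a thread_local object keeps whatever the previous iteration on that thread left in it. Each iteration writes only to its own output slot, so the lambda has no data races under the par policy.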