Remove 64-thread limit
Currently TMC is limited to 64 threads because certain per-thread information is stored in atomic bitsets, which are 64 bits in size. One way to resolve this would be to implement them as expandable bitsets (a vector of size_t). Another possibility would be to change the current SoA design to an AoS design where the sleep/wake/priority interrupt info for each thread is kept together. This may actually result in a performance increase by reducing sharing between threads.
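To make the two options concrete, here is a minimal sketch of each. This is not TMC's actual code: the names AtomicBitset, ThreadState, and every member are hypothetical, and the 64-byte alignment is an assumed cache line size.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Option 1 (hypothetical): a bitset backed by several atomic words, sized
// once at construction, so it can track more than 64 threads.
class AtomicBitset {
  static constexpr std::size_t kBitsPerWord =
      std::numeric_limits<std::size_t>::digits;
  std::vector<std::atomic<std::size_t>> words_;

public:
  explicit AtomicBitset(std::size_t bit_count)
      : words_((bit_count + kBitsPerWord - 1) / kBitsPerWord) {
    for (auto& w : words_) {
      w.store(0, std::memory_order_relaxed);
    }
  }

  // Sets bit idx; returns true if this call changed it from 0 to 1.
  bool set(std::size_t idx) {
    const std::size_t mask = std::size_t{1} << (idx % kBitsPerWord);
    const std::size_t prev =
        words_[idx / kBitsPerWord].fetch_or(mask, std::memory_order_acq_rel);
    return (prev & mask) == 0;
  }

  // Clears bit idx; returns true if this call changed it from 1 to 0.
  bool clear(std::size_t idx) {
    const std::size_t mask = std::size_t{1} << (idx % kBitsPerWord);
    const std::size_t prev =
        words_[idx / kBitsPerWord].fetch_and(~mask, std::memory_order_acq_rel);
    return (prev & mask) != 0;
  }

  bool test(std::size_t idx) const {
    const std::size_t mask = std::size_t{1} << (idx % kBitsPerWord);
    return (words_[idx / kBitsPerWord].load(std::memory_order_acquire) &
            mask) != 0;
  }

  std::size_t word_count() const { return words_.size(); }
};

// Option 2 (hypothetical): AoS layout with one record per thread, aligned
// to an assumed 64-byte cache line so that records do not share a line.
struct alignas(64) ThreadState {
  std::atomic<bool> sleeping{false};
  std::atomic<bool> wake_requested{false};
  std::atomic<std::size_t> interrupt_priority{0};
};
```

Note the trade-offs: with the multi-word bitset, each word is updated atomically but a scan across several words is no longer a single atomic snapshot. With the AoS layout, finding a sleeping thread means walking the records instead of scanning bits in one word.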
What happens right now if TMC is used on a machine with more than 64 cores?
The number of threads in a single ex_cpu will be clamped to 64. This happens if you call set_thread_occupancy() or use the automatic thread configuration (with or without hwloc).
If you call set_thread_count(128), a debug assert will fire; in release mode, however, this assert is disabled and the number of threads will be clamped to 64.
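The assert-then-clamp behavior described above follows a common pattern, sketched below. This is an illustration only, not TMC's implementation: kMaxThreads, clamp_thread_count, and checked_thread_count are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical constant mirroring the 64-thread limit discussed above.
constexpr std::size_t kMaxThreads = 64;

// Pure clamp: never returns more than kMaxThreads.
constexpr std::size_t clamp_thread_count(std::size_t requested) {
  return std::min(requested, kMaxThreads);
}

// In a debug build the assert aborts on an oversized request. When NDEBUG
// is defined (typical for release builds) the assert compiles away and the
// value is silently clamped instead.
inline std::size_t checked_thread_count(std::size_t requested) {
  assert(requested <= kMaxThreads && "thread count exceeds the 64 limit");
  return clamp_thread_count(requested);
}
```

Under this pattern, an oversized request fails loudly during development but degrades to 64 threads in a release build rather than crashing.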