Adam Lugowski

Results: 30 comments by Adam Lugowski

> I noticed the comment for M1 and thought it could cause some of the levelling off; I believe M1s always have 4 efficiency cores, not 2. Mine is 6...

A note about large p, like 32 or 64: those sizes can uncover some non-obvious sequencing. A big one is allocators. The standard library's allocator is thread-safe, but it can...

I've implemented a way to parallelize only some binops, and enabled that for plus, minus, and multiply only. That reduces compilation time, so I enabled the other parallel...

I know I'm a broken record about library size, but even on the `main` branch binops account for half of `_sparsetools.so` size (~2MB worth), and about 5% of overall `python...

I've simplified the PR based on feedback:

* Parallelize only matmul, sort_elements, and matvec. These have obvious payoff.
* No parallel binops. See #19765 for a sequential binop optimization.
* ...
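To illustrate why matvec is a natural candidate for parallelization, here is a minimal sketch in pure Python. It is not the PR's C++ code: the function names `csr_matvec_rows` and `parallel_csr_matvec` and the `num_threads` parameter are made up for this example. Each output row of a CSR matrix-vector product depends only on that row's entries, so contiguous row ranges can be handed to separate workers that write to disjoint parts of the result without locking.

```python
from concurrent.futures import ThreadPoolExecutor


def csr_matvec_rows(indptr, indices, data, x, y, row_start, row_end):
    """Compute y[i] = sum_j A[i, j] * x[j] for rows in [row_start, row_end)."""
    for i in range(row_start, row_end):
        acc = 0.0
        for k in range(indptr[i], indptr[i + 1]):
            acc += data[k] * x[indices[k]]
        y[i] = acc


def parallel_csr_matvec(indptr, indices, data, x, num_threads=4):
    """Row-partitioned CSR matvec; hypothetical helper for illustration only."""
    n_rows = len(indptr) - 1
    y = [0.0] * n_rows
    # Ceiling division: contiguous chunks of rows, one chunk per task.
    chunk = max(1, -(-n_rows // num_threads))
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        futures = [
            pool.submit(csr_matvec_rows, indptr, indices, data, x, y,
                        start, min(start + chunk, n_rows))
            for start in range(0, n_rows, chunk)
        ]
        for future in futures:
            future.result()  # re-raise any exception from a worker
    return y
```

In CPython the GIL prevents this pure-Python version from actually running faster; it only shows the partitioning. A native implementation would apply the same row split to compiled loops.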

> @alugowski one other question: did you consider fork-safety of the threadpool? The pocketfft threadpool implementation uses `pthread_atfork` shutdown/restart to handle this (hat tip to @peterbell10 for pointing this out)....
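The shutdown/restart pattern mentioned in the question can be sketched in Python, where `os.register_at_fork` plays the role that `pthread_atfork` plays in C and C++. This is an illustration of the idea only, not the pocketfft code: the `ForkAwarePool` class and its methods are hypothetical. Worker threads do not survive `fork()` in the child, so the pool stops them before the fork and starts fresh ones afterwards in both processes.

```python
import os
import queue
import threading
from concurrent.futures import Future


class ForkAwarePool:
    """Tiny worker pool that stops its threads before fork() and restarts them after."""

    def __init__(self, num_workers=2):
        self.num_workers = num_workers
        self._tasks = queue.Queue()
        self._threads = []
        self._start()
        # POSIX only, hence the guard. The hooks cannot be unregistered,
        # so this sketch suits a single long-lived, process-wide pool.
        if hasattr(os, "register_at_fork"):
            os.register_at_fork(
                before=self._stop,
                after_in_parent=self._start,
                after_in_child=self._start,
            )

    def _start(self):
        self._threads = [
            threading.Thread(target=self._worker, daemon=True)
            for _ in range(self.num_workers)
        ]
        for thread in self._threads:
            thread.start()

    def _stop(self):
        # One sentinel per worker; each worker exits on the first one it sees.
        for _ in self._threads:
            self._tasks.put(None)
        for thread in self._threads:
            thread.join()
        self._threads = []

    def _worker(self):
        while True:
            item = self._tasks.get()
            if item is None:
                return
            func, args, future = item
            try:
                future.set_result(func(*args))
            except Exception as exc:
                future.set_exception(exc)

    def run(self, func, *args):
        """Run func(*args) on a worker thread and wait for its result."""
        future = Future()
        self._tasks.put((func, args, future))
        return future.result()
```

The sketch assumes no other thread is submitting work while the fork happens; a production pool would also need to guard its queue against that case.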

> Looks like there's a crash in `_mul_sparse_matrix` in the Windows job that builds sdist/wheel (so uses `-O3`, unlike the `dev.py` builds).

The last two test failures were true head-scratchers....

> If C++ parallel has a wait policy, it could end up stalling when switching between different parallel contexts: [OpenMathLib/OpenBLAS#3187](https://github.com/OpenMathLib/OpenBLAS/issues/3187) and [OpenMathLib/OpenBLAS#3187 (comment)](https://github.com/OpenMathLib/OpenBLAS/issues/3187#issuecomment-940999630)
>
> In the issue, the example...

I wish I could offer more than "Controller doesn't work in that context". No stack traces, just that the controller doesn't call the `set_num_threads` method as one would expect. If...

Ah, I'll try to summarize: threadpoolctl uses methods from libc to get the list of loaded libraries that are then matched against registered controllers. If there is no compatible libc...
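The matching step described above can be sketched as follows. This is not threadpoolctl's implementation: the `CONTROLLER_PREFIXES` table and the `match_controllers` function are hypothetical, and the list of loaded library paths is passed in as plain strings rather than obtained from libc.

```python
import os

# Illustrative table only (an assumption, not threadpoolctl's real registry):
# controller name -> filename prefixes that identify its shared library.
CONTROLLER_PREFIXES = {
    "openblas": ("libopenblas",),
    "openmp": ("libgomp", "libomp", "libiomp"),
}


def match_controllers(loaded_library_paths, prefixes=CONTROLLER_PREFIXES):
    """Return (controller_name, path) pairs for libraries matching a known prefix."""
    matches = []
    for path in loaded_library_paths:
        filename = os.path.basename(path).lower()
        for name, candidates in prefixes.items():
            # str.startswith accepts a tuple of candidate prefixes.
            if filename.startswith(candidates):
                matches.append((name, path))
                break
    return matches
```

If the list of loaded libraries cannot be obtained, it is effectively empty, and a lookup like this one finds nothing to control.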