Possible speed-up for emcee implementation?
Hi all,
The emcee documentation suggests limiting the use of args in EnsembleSampler for parallel runs that involve large datasets or large objects, since they end up being re-pickled every iteration. See: https://emcee.readthedocs.io/en/stable/tutorials/parallel/
To me, it looks like a lot of stuff is being passed to EnsembleSampler through kwargs (which I assume behaves in a similar fashion?). See: https://github.com/phoebe-project/phoebe2/blob/7b1d73f751db49fd63511ac154245ab8c1c373bd/phoebe/solverbackends/solverbackends.py#L1648
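As a rough illustration of the concern (a minimal sketch, not PHOEBE or emcee code), the snippet below compares how many bytes would have to be serialized per call when a large object travels with the arguments and when it lives in a module-level global instead. The names `DATA`, `log_prob`, and `theta` are hypothetical placeholders, and the assumption that the kwargs are pickled together with each call is exactly the point I'm unsure about.

```python
import pickle

# Hypothetical stand-in for a large dataset; not taken from PHOEBE.
DATA = [float(i) for i in range(100_000)]


def log_prob(theta, data=None):
    """Toy Gaussian log-probability; falls back to the global DATA."""
    if data is None:
        data = DATA
    return -0.5 * sum((d - theta[0]) ** 2 for d in data)


theta = [0.5]

# What would be serialized per call if the data travels with the arguments
# (assumed behaviour, to be confirmed by profiling) ...
payload_with_kwargs = pickle.dumps((theta, {"data": DATA}))
# ... versus when only the parameter vector is sent and the data is a global.
payload_global = pickle.dumps((theta, {}))

print(f"with kwargs: {len(payload_with_kwargs)} bytes")
print(f"global data: {len(payload_global)} bytes")
```

Both variants evaluate to the same log-probability; only the amount of data shipped to the worker processes differs.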
I don't know how big of an issue this is, since I'm not really familiar with the way PHOEBE is structured internally. I just wanted to mention it, in case it could be useful to improve performance.
All the best, Jeppe
I suspect the overhead here is minimal compared to other bottlenecks in phoebe, but it is definitely worth investigating/profiling at some point and, if needed, changing the implementation. This probably isn't a high priority, but I'll leave it open so that we can revisit it in the future. Thanks!
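For a first estimate, something along these lines could be used to time the serialization round trip of whatever is passed through kwargs and compare it with the cost of a single forward-model evaluation. This is only a sketch: `pickle_roundtrip_seconds` and `example_kwargs` are hypothetical names, and the dictionary is a placeholder rather than the actual objects handed to the sampler.

```python
import pickle
import time


def pickle_roundtrip_seconds(obj, repeats=5):
    """Return the best-of-`repeats` time to pickle and unpickle `obj`."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        pickle.loads(pickle.dumps(obj))
        best = min(best, time.perf_counter() - start)
    return best


# Hypothetical placeholder for whatever is passed via kwargs.
example_kwargs = {"values": list(range(50_000)), "label": "demo"}

cost = pickle_roundtrip_seconds(example_kwargs)
print(f"pickle round trip: {cost * 1e3:.3f} ms per call")
```

If that number turns out to be a small fraction of one model computation, the overhead is unlikely to matter in practice.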