random-forest-importances
Added multiprocessing for oob_importances (v2)
Continuation of PR #20, addressing Issue #19
Excellent. I will take a look this afternoon.
Hi. Okay, I tried it out and it does seem to get the same answers. The only problem is that it takes longer for me. Fundamentally, creating a separate process and having to copy the data over is going to be slower than single threading for anything other than small data sets. Apparently in Python 3.8 we are going to get proper shared memory, so I think we should wait until then. I will keep this PR because it might simply work automatically, or with a small tweak, at the next release of Python. A quick check shows that they are at alpha 2, so it's unclear when it will come out. The release candidate seems to be planned for the end of September 2019.
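For reference, here is a minimal sketch of what the shared-memory approach could look like with the `multiprocessing.shared_memory` module that arrives in Python 3.8. It is not the code in this PR: the helper names (`create_shared_doubles`, `sum_shared_doubles`, `worker`) are hypothetical, and it sums a plain list of floats rather than computing importances. The point is only that a worker attaches to the block by name instead of receiving a pickled copy of the data.

```python
import struct
from multiprocessing import Process, Queue, shared_memory


def create_shared_doubles(values):
    """Copy a list of floats into a new shared-memory block (done once)."""
    # Hypothetical helper: 8 bytes per double; size must be at least 1.
    shm = shared_memory.SharedMemory(create=True, size=max(1, len(values)) * 8)
    struct.pack_into(f"{len(values)}d", shm.buf, 0, *values)
    return shm


def sum_shared_doubles(name, n_values):
    """Attach to an existing block by name and sum its contents."""
    shm = shared_memory.SharedMemory(name=name)
    try:
        # View the raw bytes as doubles; the view is released on exit
        # so that shm.close() does not fail on an exported buffer.
        with shm.buf[: n_values * 8].cast("d") as view:
            return sum(view)
    finally:
        shm.close()


def worker(name, n_values, results):
    # Only the block name and length cross the process boundary.
    results.put(sum_shared_doubles(name, n_values))


if __name__ == "__main__":
    data = [float(i) for i in range(1000)]
    shm = create_shared_doubles(data)
    try:
        results = Queue()
        p = Process(target=worker, args=(shm.name, len(data), results))
        p.start()
        print(results.get())
        p.join()
    finally:
        shm.close()
        shm.unlink()  # free the block once every process is done with it
```

Whether this actually beats the single-process version for the out-of-bag importances would still need to be measured; the sketch only removes the per-process copy of the input data.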
Okay, that makes sense. I'm just curious: how much longer does it take, and how large is your dataset? It'll be interesting to see what happens when Python 3.8 gets released.
It's about 50% longer. My data set is about 100,000 records, I think.