Jacob Schreiber

Results: 320 comments by Jacob Schreiber

Howdy. Thanks for the responses. I've been out for a bit, but I'm looking into this more. When I said I wanted a cython function, I meant that I was...

I have updated my list. I think it is appropriate to tackle the issues sequentially, each with a single corresponding PR. I will submit a PR for the reorganization as soon...

```
BRANCH
Classification performance:
===========================
Classifier   train-time test-time error-rate
--------------------------------------------
RandomForest  79.8781s   0.3793s   0.0219
CART          15.0324s   0.0216s   0.0444

MASTER
Classification performance:
===========================
Classifier   train-time test-time error-rate
--------------------------------------------
RandomForest  37.2150s   0.4297s...
```

A style question I have is why we are using SIZE_t, DTYPE_t, DOUBLE_t, double, UINT32_t, INT32_t, unsigned char, and int as different datatypes in these modules. It seems excessively confusing....

@arjoly I think I might've deleted your addition to the list about feature computations. If I did, can you re-add it at your convenience please?

I didn't mean physically merge them. I meant change all integers to be of type SIZE_t, rather than INT32_t, UINT32_t, unsigned char*, and int as well, and change all doubles...

I've added a new PR (#5278) which cleans up criterion and will make adding caching easier. It also means that all criteria store three pointers (node_sum, node_sum_left, and node_sum_right), which...

Also, PR #5252, regarding merging PresortBestSplitter and BestSplitter, is still awaiting review. I would like to have both #5252 and #5278 merged before adding caching across splits.

Apologies for disappearing temporarily. I should have more time for this now.

I'm halfway through the PR. However, research has struck temporarily, and so I've had far more limited time than I expected to work on this.