Poom Chiarawongse
Thanks for raising this. It may take a long time, because we are in the process of overhauling the library, but I will try to make accommodations for this issue.
How about this? We change the structure in such a way that storing which leaf the labels end up in can be optional. The advantage of doing this is that...
@bensadeghi can you recommend a dataset for testing the memory usage when writing trees to disk?
@bensadeghi Sadly neither BSON nor JLD currently supports 0.7-beta, so I think we will have to wait a little before the new Leaf struct can be implemented. How do you...
Hmm, now that you say it, I agree with you on keeping the current packages backwards compatible. There really is no reason to force them to be `Int64`. As a...
Do you think `Float64` will pose a problem for 32-bit architectures too? If that's the case, can you suggest an architecture-dependent fixed-sized alternative like `Int`?
Discrete entropy is bounded by the log of the number of possible states. So we can squish the range of classification's pruning purity to be in [0, 1]. However, I...
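To illustrate the bound mentioned above: the Shannon entropy of a label set with `k` possible classes is at most `log(k)`, so dividing by `log(k)` gives a value in [0, 1]. The sketch below is in Python rather than the library's own language, and the function names `normalized_entropy` and `normalized_purity` are hypothetical; reading "purity" as one minus the normalized entropy is an assumption, not something the comment states.

```python
import math
from collections import Counter


def normalized_entropy(labels, n_classes):
    """Shannon entropy of `labels` divided by log(n_classes).

    Entropy over n_classes states is at most log(n_classes),
    so the result lies in [0, 1].
    """
    if n_classes < 2 or not labels:
        return 0.0
    n = len(labels)
    counts = Counter(labels)
    # Sum p * log(1/p) over the classes that actually occur.
    entropy = sum((c / n) * math.log(n / c) for c in counts.values())
    return entropy / math.log(n_classes)


def normalized_purity(labels, n_classes):
    """Assumed definition: 1 for a pure node, 0 for a uniform one."""
    return 1.0 - normalized_entropy(labels, n_classes)
```

A node containing a single class gets purity 1, and a node with all `n_classes` classes equally represented gets purity 0, regardless of how many classes the problem has.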
I have an idea. How about we change the purity from the negative of the variance to the proportion of explained variance? So instead of using `-var(leaf)` as the purity, we use `(var(tree)...
@bensadeghi I realized that the work of `prune_tree` can be done during tree building. We can add another argument to `build_tree` called `min_impurity_split`, which should handle this kind of thing and be...
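A rough sketch of what pruning during tree building could look like, written in Python for illustration and not taken from the library. It assumes that `min_impurity_split` means "a node whose impurity is at or below this threshold becomes a leaf"; the comment does not spell out the semantics. The toy `build_tree` here handles a single numeric feature with variance as the impurity, and `count_leaves` is a hypothetical helper.

```python
from statistics import pvariance


def build_tree(xs, ys, min_impurity_split=0.0):
    """Grow a tiny 1-D regression tree.

    Assumed semantics: a node whose variance is <= min_impurity_split
    is turned into a leaf instead of being split further.
    """
    impurity = pvariance(ys)
    if impurity <= min_impurity_split or len(set(xs)) < 2:
        return {"leaf": True, "value": sum(ys) / len(ys)}

    # Try every threshold between distinct feature values and keep
    # the one with the lowest weighted child variance.
    best_score, best_t = None, None
    for t in sorted(set(xs))[1:]:
        left = [y for x, y in zip(xs, ys) if x < t]
        right = [y for x, y in zip(xs, ys) if x >= t]
        score = (len(left) * pvariance(left)
                 + len(right) * pvariance(right)) / len(ys)
        if best_score is None or score < best_score:
            best_score, best_t = score, t

    left_pairs = [(x, y) for x, y in zip(xs, ys) if x < best_t]
    right_pairs = [(x, y) for x, y in zip(xs, ys) if x >= best_t]
    return {
        "leaf": False,
        "threshold": best_t,
        "left": build_tree(*zip(*left_pairs), min_impurity_split),
        "right": build_tree(*zip(*right_pairs), min_impurity_split),
    }


def count_leaves(node):
    """Count the leaves of a tree produced by build_tree."""
    if node["leaf"]:
        return 1
    return count_leaves(node["left"]) + count_leaves(node["right"])
```

With `xs = [1, 2, 3, 4]` and `ys = [1.0, 1.0, 5.0, 5.0]`, the default threshold yields a split at 3 with two pure leaves, while a threshold above the root variance (4.0) stops growth at the root. The effect is the same as building the full tree and pruning afterwards, but without ever materializing the pruned nodes.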
I see how classification trees can be compactified using the data structure from `NodeMeta`. However, the case seems a little harder for regression trees where there may be as many...