DecisionTree.jl
VFDTs based on this package?
Hi, I'm interested in a Julia implementation of Domingos's VFDTs (Very Fast Decision Trees), a.k.a. "Hoeffding trees"; see, for example:
http://weka.sourceforge.net/doc.dev/weka/classifiers/trees/HoeffdingTree.html
This is a streaming algorithm for learning decision trees and might be very useful for modelling "big data" such as logs etc.
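For context, here is a rough sketch of the test that lets the tree learn from a stream without storing the examples. The notation is the usual textbook one and is an assumption here, not anything taken from this package: $n$ is the number of examples seen at a leaf, $R$ is the range of the split criterion $G$ (for information gain with $c$ classes, $R = \log_2 c$), $\delta$ is the allowed error probability, and $X_a$, $X_b$ are the best and second-best attributes by observed $\bar{G}$.

```latex
% Hoeffding bound: with probability at least 1 - \delta, the true mean
% is no smaller than the observed mean minus \epsilon.
\epsilon = \sqrt{\frac{R^{2}\,\ln(1/\delta)}{2n}}

% Split rule at a leaf: split on X_a once the observed gap exceeds \epsilon.
\bar{G}(X_a) - \bar{G}(X_b) > \epsilon
```

Since $\epsilon$ shrinks as $n$ grows, each leaf only needs per-attribute counts (sufficient statistics) rather than the examples themselves.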
Are there any plans for implementing streaming algorithms within this package? If not, do you think it is feasible on top of the infrastructure provided here, or would a clean, separate implementation or package be better?
Thanks for any input you might have.
I have also missed something like that in Julia. But perhaps it would fit better in OnlineStats.jl?
Agreed, although my feeling is that the types and methods available in this package (DT) might be needed for a VFDT implementation. Also, in some sense, it is not clear that a VFDT has an O(1) memory requirement, since the tree might grow very large.
What is your opinion on this, @joshday? Would VFDT fit the scope of OnlineStats.jl or would it better fit elsewhere?
Yes, it would fit in the scope of OnlineStats. I actually started working on it recently, but @robertfeldt makes a good point that before it's done I'll probably be reinventing some data structures that exist here. I think that's fine at least in the short term.