DecisionTree.jl

VFDTs based on this package?

Open robertfeldt opened this issue 7 years ago • 4 comments

Hi, I'm interested in a Julia implementation of Domingos and Hulten's VFDTs (Very Fast Decision Trees), aka "Hoeffding Trees"; see, for example:

http://weka.sourceforge.net/doc.dev/weka/classifiers/trees/HoeffdingTree.html

This is a streaming algorithm for learning decision trees incrementally, one example at a time, and might be very useful for modelling "big data" such as logs.
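For anyone unfamiliar with the algorithm, here is a rough sketch of the split test at its core. It is written in Python purely for illustration and is not code from this package or from Weka. The function names, the default `delta`, and the `tie_threshold` parameter are my own assumptions. The idea is that a leaf is split only once the observed gap between the two best attributes exceeds the Hoeffding bound `epsilon = sqrt(R^2 * ln(1/delta) / (2n))`, where `R` is the range of the split criterion and `n` is the number of examples seen at that leaf.

```python
import math


def hoeffding_bound(value_range: float, delta: float, n: int) -> float:
    """Hoeffding bound: with probability 1 - delta, the true mean of a
    variable with range `value_range` lies within this epsilon of the
    mean observed over n independent samples."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def should_split(best_gain: float, second_gain: float, n: int,
                 num_classes: int, delta: float = 1e-7,
                 tie_threshold: float = 0.05) -> bool:
    """Decide whether a leaf has seen enough examples to split.

    Assumes the split criterion is information gain, whose range is
    log2(num_classes). `tie_threshold` is an illustrative tie-breaking
    parameter: when epsilon gets this small, the top attributes are
    treated as equally good and the leaf splits anyway.
    """
    value_range = math.log2(num_classes)
    epsilon = hoeffding_bound(value_range, delta, n)
    return (best_gain - second_gain) > epsilon or epsilon < tie_threshold


# With few examples the bound is wide, so the leaf keeps collecting data.
print(should_split(0.30, 0.25, n=200, num_classes=2))
# With many examples the bound shrinks below the observed gap.
print(should_split(0.30, 0.25, n=20000, num_classes=2))
```

Because the decision only needs per-leaf counts and `n`, each example can be discarded after it updates the counts, which is what makes the method suitable for streams.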

Are there any plans for implementing streaming algorithms within this package? If not, do you think it is feasible on top of the infrastructure provided here, or would a clean, separate implementation or package be better?

Thanks for any input you might have.

robertfeldt, Dec 03 '17 11:12

I have also missed something like that in Julia. But perhaps it would fit better in OnlineStats.jl?

ValdarT, Dec 03 '17 11:12

Agreed, although my feeling is that the types and methods available in this package (DT) might be needed for a VFDT implementation. Also, it is less clear that a VFDT has an O(1) memory requirement, since the tree might grow very large.
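To make the memory concern concrete, here is a back-of-the-envelope sketch in Python (illustration only; the function name and the numbers are made up). It assumes discrete attributes and that each leaf stores one counter per (attribute, attribute value, class) combination, so the total grows linearly with the number of leaves rather than staying constant.

```python
def leaf_counter_count(num_leaves: int, num_attributes: int,
                       values_per_attribute: int, num_classes: int) -> int:
    """Number of sufficient-statistic counters held across all leaves,
    assuming one counter per (attribute, value, class) at every leaf."""
    return num_leaves * num_attributes * values_per_attribute * num_classes


# Hypothetical problem: 10 attributes, 5 values each, 2 classes.
print(leaf_counter_count(1, 10, 5, 2))     # a fresh tree with a single leaf
print(leaf_counter_count(1000, 10, 5, 2))  # the same tree after growing to 1000 leaves
```

So memory is independent of the number of examples seen, but not of the tree size, which is why an implementation would likely need some way to cap growth or deactivate unpromising leaves.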

robertfeldt, Dec 03 '17 11:12

What is your opinion on this, @joshday? Would VFDT fit the scope of OnlineStats.jl or would it better fit elsewhere?

ValdarT, Dec 28 '17 11:12

Yes, it would fit in the scope of OnlineStats. I actually started working on it recently, but @robertfeldt makes a good point that before it's done I'll probably be reinventing some data structures that exist here. I think that's fine at least in the short term.

joshday, Dec 28 '17 13:12