DecisionTree.jl icon indicating copy to clipboard operation
DecisionTree.jl copied to clipboard

Allow features to be `AbstractVector` (in addition to `AbstractMatrix`)

Open sanderbboisen opened this issue 3 years ago • 6 comments

I was playing around with DecisionTree.jl for a potential project and found that I could not produce a regression tree from a single feature. I have made changes to allow features to be a vector in the build_tree functions.

I have tested the changes using RunTest.jl with Julia 1.6.4 .

sanderbboisen avatar Feb 23 '22 10:02 sanderbboisen

I don't see any harm in this enhancement, apart from a mild complication in the codebase.

Anyone else have objections?

@sanderbboisen Would you be willing to add a test of native and ScikitLearn API with a vector X ?

ablaom avatar May 06 '22 01:05 ablaom

I don't see any harm in this enhancement, apart from a mild complication in the codebase.

Anyone else have objections?

@sanderbboisen Would you be willing to add a test of native and ScikitLearn API with a vector X ?

Sure, I will get on it. Incoming update.

sanderbboisen avatar May 13 '22 17:05 sanderbboisen

Codecov Report

:exclamation: No coverage uploaded for pull request base (dev@17cb46d). Click here to learn what that means. The diff coverage is n/a.

@@          Coverage Diff           @@
##             dev     #150   +/-   ##
======================================
  Coverage       ?   88.84%           
======================================
  Files          ?        9           
  Lines          ?      986           
  Branches       ?        0           
======================================
  Hits           ?      876           
  Misses         ?      110           
  Partials       ?        0           

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 17cb46d...070e86e. Read the comment docs.

codecov-commenter avatar May 16 '22 01:05 codecov-commenter

I ran into an issue as several functions like apply_something were overloaded so that any Vector inputs worked differently. These are essentially only called by theapply_something functions that users are supposed to call. I have taken the path of least modification, and named them _apply_something. This way users only have access to the functions they are intended to use, and using Vectors as inputs work.

I have added tests to tests/scikitlearn.jl and everything seems to work.

I am sorry for the confusing commit structure, things got a little messy at some point.

I think there is a general issue with the way that the module is constructed, and I have been refactoring it in a separate branch on my fork. For one, the structure does not allow multiple usage of utils.jl and there is generally some issues with code repetition iirc. I will open another PR on this when I am done refactoring.

sanderbboisen avatar May 18 '22 12:05 sanderbboisen

I am happy with this now.

I suspect this smaller PR may create a few merge conflicts for the more complicated #166, so let's wait for:

  • [x] #166 is merged

before merging this one.

@sanderbboisen Thanks for your contribution and patience.

ablaom avatar Jun 01 '22 04:06 ablaom

@sanderbboisen Would you now be happy to rebase your PR?

ablaom avatar Jul 01 '22 00:07 ablaom