
Modify example datasets

Open agitter opened this issue 4 years ago • 7 comments

There were a few instances where our sample datasets did not give the desired outcome, which made it hard to illustrate the points we wanted to make about hyperparameter or model selection:

  • Decision tree gave perfect accuracy on the example data
  • Logistic regression with L1 regularization did not give a solution that ignored one of the features

Part of the challenge may be the random data splitting. Do we need to introduce an explicit seed? Would that help introduce reproducibility concepts or complicate the workflow too much?
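
To make the seed question concrete, here is a minimal sketch of what a fixed seed buys us, written against scikit-learn's `train_test_split` directly rather than the ml4bio software. The toy arrays, the `SEED` value, and the `split` helper are assumptions for illustration only; they are not taken from the workshop datasets or code.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical seed value; any fixed integer behaves the same way.
SEED = 0

# Toy stand-in for an example dataset: 20 samples, 2 features, 2 classes.
X = np.arange(40).reshape(20, 2)
y = np.array([0, 1] * 10)


def split(seed):
    """Split the toy data with a fixed seed (illustrative helper)."""
    return train_test_split(X, y, test_size=0.25, random_state=seed)


# Two independent calls with the same seed.
X_train_a, X_test_a, y_train_a, y_test_a = split(SEED)
X_train_b, X_test_b, y_train_b, y_test_b = split(SEED)

# With the same seed, the same samples land in the test set both times.
same_split = (
    np.array_equal(X_test_a, X_test_b)
    and np.array_equal(y_test_a, y_test_b)
)
print("Same split with same seed:", same_split)
```

If the software exposed something like this, every participant would get the same train/test split for a given dataset, so an example tuned to show a particular effect would show it for everyone. The cost is one more setting to explain.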

agitter, Feb 03 '20 17:02