ml4bio-workshop

Guidance on how to get started and where to go next

Open agitter opened this issue 4 years ago • 0 comments

In the surveys, some participants wanted to know more about how to assess whether they can get started on ML for a real dataset. For instance, how many samples are needed? How can one assess whether they have enough data for classification? It depends on many factors, but we could give some general advice.
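One way to make the "do I have enough data?" question concrete is a learning curve: train on growing subsets and watch whether cross-validated accuracy is still improving. The sketch below is only an illustration, not workshop material. It assumes scikit-learn is installed, uses a synthetic dataset from `make_classification` as a stand-in for a participant's real data, and picks logistic regression arbitrarily.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, learning_curve

# Synthetic stand-in for a real dataset (assumption: replace with your own X, y).
X, y = make_classification(
    n_samples=300, n_features=20, n_informative=5, random_state=0
)

# 5-fold stratified cross-validation; each training fold has 240 samples.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# Train on 20%, 40%, ..., 100% of each training fold and score on the held-out fold.
train_sizes, train_scores, valid_scores = learning_curve(
    LogisticRegression(max_iter=1000),
    X,
    y,
    train_sizes=np.linspace(0.2, 1.0, 5),
    cv=cv,
    scoring="accuracy",
)

mean_valid = valid_scores.mean(axis=1)
std_valid = valid_scores.std(axis=1)

for n, m, s in zip(train_sizes, mean_valid, std_valid):
    print(f"{int(n):4d} training samples: validation accuracy {m:.3f} +/- {s:.3f}")
```

If the validation score is still climbing at the largest training size, more samples would probably help. If it has flattened, the limit is more likely the features or the model than the sample count.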

In addition, they had questions about how to scale up and move to larger datasets. We can use the Jupyter notebook to illustrate moving from the ml4bio software to Python code. Then we can describe the need for batch or high-throughput analyses to explore classifiers and hyperparameters at scale in a real research setting.
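As a rough idea of what exploring classifiers and hyperparameters looks like once you are in Python, here is a minimal sketch using scikit-learn's `GridSearchCV`. The dataset is synthetic and the classifiers and grid values are arbitrary choices for illustration, not recommendations from the workshop.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for a real dataset (assumption: replace with your own X, y).
X, y = make_classification(
    n_samples=200, n_features=20, n_informative=5, random_state=0
)

# Hold out a test set that the search never sees.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# The "clf" step is a placeholder that the grid swaps out.
pipeline = Pipeline([("scale", StandardScaler()), ("clf", SVC())])

# Two sub-grids: 3 * 2 = 6 SVM settings and 2 * 2 = 4 random forest settings.
param_grid = [
    {
        "clf": [SVC()],
        "clf__C": [0.1, 1, 10],
        "clf__kernel": ["linear", "rbf"],
    },
    {
        "clf": [RandomForestClassifier(random_state=0)],
        "clf__n_estimators": [25, 50],
        "clf__max_depth": [3, None],
    },
]

# Every candidate is scored with 5-fold cross-validation on the training set.
search = GridSearchCV(pipeline, param_grid, cv=5, scoring="accuracy")
search.fit(X_train, y_train)

print("Best parameters:", search.best_params_)
print("Best cross-validated accuracy:", round(search.best_score_, 3))
print("Held-out test accuracy:", round(search.score(X_test, y_test), 3))
```

Even this toy grid fits 10 candidates times 5 folds, and the count multiplies with every added hyperparameter. That growth is the reason batch or high-throughput computing becomes useful on real datasets.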

We could also refer to some of the extra datasets we've prepared that aren't used in the workshop so that participants can continue to explore and learn on their own.

agitter, Feb 14 '20 19:02