Code license: GPL v3. Data license: CC BY 4.0.

DOME: Recommendations for supervised machine learning validation in biology

Authors and Original Manuscript

Ian Walsh, Dmytro Fishman, Dario Garcia-Gasulla, Tiina Titma, Gianluca Pollastri, The ELIXIR Machine Learning focus group, Jennifer Harrow, Fotis E. Psomopoulos, & Silvio C.E. Tosatto

About DOME Recommendations

Modern biology frequently relies on machine learning to provide predictions and improve decision processes. There have been recent calls for closer scrutiny of machine learning performance and its possible limitations.

These community-wide recommendations aim to help establish standards for supervised machine learning validation in biology by adopting a structured methods description based on Data, Optimization, Model and Evaluation (DOME). The recommendations are formulated as questions for anyone implementing a machine learning algorithm. Answers to these questions can easily be included in the supplementary material of published papers.

About This Repo

Our goal is to act as a single point of reference for best practices, guidelines and recommendations for Machine Learning in Life Sciences. The current set of recommendations is aimed primarily at supervised learning in biology in the absence of direct experimental validation, as this is the most common type of ML approach used.

The data, which include a YAML form of the DOME recommendations, are under a CC BY 4.0 License. The code that parses the YAML to produce tabular output as an Excel file is under a GNU GPL v3 License.
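As a rough illustration of the YAML-to-table step, the sketch below flattens a nested mapping of DOME answers into rows. It is not the repository's parser: the `dome_answers` dictionary stands in for already-parsed YAML, its question names and answers are made-up placeholders, the helper names `to_rows` and `to_csv` are hypothetical, and CSV is used in place of Excel so that only the Python standard library is needed.

```python
import csv
import io

# Hypothetical stand-in for the parsed YAML: section -> question -> answer.
# Question names and answers are placeholders, not the repository's schema.
dome_answers = {
    "Data": {
        "Provenance": "Public database, release 2020_01",
        "Dataset splits": "80% training, 20% independent test",
    },
    "Optimization": {"Algorithm": "Random forest"},
    "Model": {"Interpretability": "Feature importances reported"},
    "Evaluation": {"Performance measures": "Accuracy, MCC"},
}


def to_rows(answers):
    """Flatten section -> question -> answer into (section, question, answer) tuples."""
    rows = []
    for section, questions in answers.items():
        for question, answer in questions.items():
            rows.append((section, question, answer))
    return rows


def to_csv(rows):
    """Render the rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["Section", "Question", "Answer"])
    writer.writerows(rows)
    return buffer.getvalue()


rows = to_rows(dome_answers)
print(to_csv(rows))
```

A table of this shape, with one row per question grouped by DOME section, is the kind of output that could be attached to a paper as supplementary material.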

We also aim to extend the DOME recommendations to other fields of ML, such as unsupervised, semi-supervised and reinforcement learning, and to other Life Science domains.

As we gather feedback, and as the field evolves, we plan to publish comprehensive updates to the DOME recommendations.

The DOME Recommendations

The DOME machine learning summary table, along with examples of its use, can be found in the /data directory.

A CodeOcean capsule is available here.

More info to be added here