Estefania Barreto-Ojeda - Applications in ML Drug Discovery pipelines | PyData NYC 2022
YouTube URL: https://www.youtube.com/watch?v=FbIyKjOdiI8
Suggested timestamps:
00:00 Welcome!
01:18 Overview
02:03 Part I: Biological data
03:13 Complexity of biological data. What makes biological data different?
12:49 Overview ML drug discovery pipelines
15:20 Challenges in ML drug discovery pipelines
17:12 Part II: Implementing data version control in Drug Discovery pipelines
17:20 Introduction to DVC
19:13 Installing and initializing DVC
21:24 Set DVC remote
22:36 Versioning files with DVC. What does dvc add do?
25:21 Implementing DVC in Drug Discovery pipelines - Demo
27:47 Data versioning
28:17 Build a DVC ML pipeline
28:30 Build a DVC ML pipeline - Featurization stage
32:28 Initial Directed Acyclic Graph (DAG)
32:50 Build a DVC ML pipeline - Processing stage
34:12 Run ML pipelines with DVC repro
34:50 What is a dvc.lock file?
35:48 Build a DVC ML pipeline - Training+Metrics stage
38:42 Final DAG
40:07 Highlights
Thank you! :)