FR: project structure skeleton
Feature request: create the skeleton of a project structure that integrates the following:
- Jupyter workbench development environment
- trainer - code for custom training
- predictor - code for custom prediction
- shared featurizer (preprocessing module for trainer and predictor)
- Dockerfile(s) for both a custom trainer and predictor along with cloudbuild.yaml(s)
Key goals:
- avoid code duplication (e.g. copying the featurizer between trainer and predictor)
- simple project structure
- follow / demo MLOps best practices
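To illustrate the first goal, here is a minimal sketch of how one featurizer module could be imported by both the trainer and the predictor instead of being copied. Every name in it (`FEATURE_ORDER`, `featurize`, `build_training_rows`, `preprocess_request`) and the example fields are hypothetical placeholders, not anything that exists in the repository today.

```python
"""Sketch of a shared featurizer used by both the trainer and the predictor."""
from typing import Dict, List

# Hypothetical feature order; a real project would define its own columns.
FEATURE_ORDER = ["age", "income"]


def featurize(record: Dict[str, float]) -> List[float]:
    """Turn one raw record into a model-ready feature vector."""
    # Missing fields default to 0.0 so training and serving handle them alike.
    return [float(record.get(name, 0.0)) for name in FEATURE_ORDER]


def build_training_rows(records: List[Dict[str, float]]) -> List[List[float]]:
    """Trainer side: featurize a batch of raw training records."""
    return [featurize(r) for r in records]


def preprocess_request(instance: Dict[str, float]) -> List[float]:
    """Predictor side: featurize a single prediction request."""
    return featurize(instance)
```

Because both entry points call the same `featurize`, a change to the preprocessing logic only has to be made in one place, which is the point of keeping it in a shared module.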
You might also create a video that demonstrates the development workflow.
Motivation: there are many guides that, in isolation, provide good coverage of a component, but they are always outside the context of an actual practical development workflow.
It might look something like this (?):
├── README.md
├── setup.py
├── notebooks
│   ├── explore.ipynb
│   └── prototype_model.ipynb
├── predictor
│   ...
├── common
│   ├── utils.py
└── trainer
    ├── Dockerfile
    ├── build.ipynb
    ├── cloudbuild.yaml
    └── src
        ├── __init__.py
        ├── dev get training data bq.ipynb
        ├── features.py
        ├── requirements.txt
        ├── sql
        │   └── train_data_gen.sql
        ├── train.py
Hello @nxorable,
This is a great idea! I am working toward something very similar but have a few hurdles to work through first. My plan is to make this type of shift as I introduce a full MLOps development perspective. At this point, most of what is currently in the repository is really just training or serving, with occasional DevOps pieces to enable that. The next phase includes wrapping these pieces for automation and triggering. More to come! I am going to leave this issue/FR here as a placeholder for what is to come. After the shift, I would love your thoughts.
Thank you,
@statmike
Great work