BERT example integration test
I think we are at the point where we need some automated testing for this. This runs preprocessing, a few pretraining steps, and a few finetuning steps by invoking our scripts on real data.
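Roughly, the test flow looks like the sketch below. All of the script names, flags, and paths are hypothetical placeholders standing in for the real example scripts, just to show the shape of the test.

```python
# Minimal sketch of the integration test flow. The script names and flags
# below are hypothetical placeholders, not the real example scripts.
import pathlib
import subprocess
import sys


def run_script(script, *args):
    # Run an example script in a subprocess and fail the test on a nonzero exit.
    subprocess.run([sys.executable, script, *args], check=True)


def test_bert_example_end_to_end(tmp_path):
    output_dir = pathlib.Path(tmp_path)
    # Preprocess a small slice of real data into pretraining features.
    run_script(
        "create_pretraining_data.py",
        f"--output_dir={output_dir / 'data'}",
    )
    # Run a handful of pretraining steps on a tiny model configuration.
    run_script(
        "run_pretraining.py",
        f"--input_dir={output_dir / 'data'}",
        f"--model_dir={output_dir / 'model'}",
        "--num_train_steps=5",
    )
    # Run a handful of finetuning steps starting from the pretrained weights.
    run_script(
        "run_finetuning.py",
        f"--model_dir={output_dir / 'model'}",
        "--num_train_steps=3",
    )
```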
@chenmoneygithub @fchollet let me know what you think of this.
We definitely need some sort of automated testing here. I think this could be a good template for integration tests for our examples: literally run all the scripts on some real data we host and download in the test.
Just running 5 pretraining steps and 3 finetuning steps on a tiny version of the architecture takes ~7 minutes on the stock GitHub testing machines (CPU only). As we figure out a way to test on accelerators, we could definitely do a bit more training here.
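For the "data we host and download in the test" part, one option is to pull a tiny fixture file through the standard Keras download-and-cache utility; something like this sketch, where the filename and URL are placeholders:

```python
# Hypothetical sketch of fetching a small hosted test fixture. The filename and
# URL are placeholders; keras.utils.get_file downloads and caches the file.
from tensorflow import keras


def download_test_data():
    return keras.utils.get_file(
        fname="bert_example_test_data.txt",
        origin="https://example.com/path/to/bert_example_test_data.txt",
    )
```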
I think I also like this as a forcing function for simple "out of the box" use. Needing to write a single, smallish test that runs your whole training pipeline is tricky, but it will force people to avoid sneaking in manual steps to get things working.
Talked with @fchollet about this; we should do a few things:
- Move as much logic as possible out of the runnable script files into `bert_model.py` (and potentially add a `bert_data.py`, others as needed). Runnable scripts should be very basic and only transform flags into function/class arguments.
- Change this integration test to run the end-to-end flow by invoking functionality from `bert_model.py` (see the sketch after this list). If we've done things right, this should still be a small and readable test.
- Leave the runnable scripts untested on CI for now; never test them through pytest. Maybe someday test them by invoking them directly for limited training runs.
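As a rough picture of where this should land, the test could look something like the sketch below. Everything imported from `bert_model` here is a hypothetical placeholder for whatever the refactored module ends up exposing; only the module name comes from the plan above.

```python
# Rough sketch of the restructured integration test. The functions called on
# bert_model are hypothetical placeholders for the refactored module's API.
import bert_model


def test_bert_end_to_end(tmp_path):
    # Preprocess a small slice of real data into training features.
    features = bert_model.preprocess_data(output_dir=tmp_path)
    # Build a tiny configuration so a few train steps finish quickly on CPU.
    model = bert_model.create_model(num_layers=2, hidden_dim=64, num_heads=2)
    # A handful of pretraining steps, then a handful of finetuning steps.
    bert_model.run_pretraining(model, features, num_steps=5)
    bert_model.run_finetuning(model, features, num_steps=3)
```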
Will try to find some time to redo this with those changes in mind.