BERT example integration test
I think we are at the point where we need some automated testing for this. This runs preprocessing, a few pretraining steps, and a few finetuning steps by invoking our scripts on real data.
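Roughly, the test flow looks like the sketch below. All of the script names, flags, and paths are hypothetical placeholders standing in for the real example scripts, just to show the shape of the test.

```python
# Minimal sketch of the integration test flow. The script names and flags
# below are hypothetical placeholders, not the real example scripts.
import pathlib
import subprocess
import sys


def run_script(script, *args):
    # Run an example script in a subprocess and fail the test on a nonzero exit.
    subprocess.run([sys.executable, script, *args], check=True)


def test_bert_example_end_to_end(tmp_path):
    output_dir = pathlib.Path(tmp_path)
    # Preprocess a small slice of real data into pretraining features.
    run_script(
        "create_pretraining_data.py",
        f"--output_dir={output_dir / 'data'}",
    )
    # Run a handful of pretraining steps on a tiny model configuration.
    run_script(
        "run_pretraining.py",
        f"--input_dir={output_dir / 'data'}",
        f"--model_dir={output_dir / 'model'}",
        "--num_train_steps=5",
    )
    # Run a handful of finetuning steps starting from the pretrained weights.
    run_script(
        "run_finetuning.py",
        f"--model_dir={output_dir / 'model'}",
        "--num_train_steps=3",
    )
```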
@chenmoneygithub @fchollet let me know what you think of this.
We definitely need some sort of automated testing here. I think this could be a good template for integration tests for our examples: literally run all the scripts on some real data we host and download in the test.
Just running 5 pretraining steps and 3 finetuning steps on a tiny version of the architecture takes ~7 minutes on the stock GitHub testing machines (CPU only). As we figure out a way to test on accelerators, we could definitely do a bit more training here.
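For the "data we host and download in the test" part, one option is to pull a tiny fixture file through the standard Keras download-and-cache utility; something like this sketch, where the filename and URL are placeholders:

```python
# Hypothetical sketch of fetching a small hosted test fixture. The filename and
# URL are placeholders; keras.utils.get_file downloads and caches the file.
from tensorflow import keras


def download_test_data():
    return keras.utils.get_file(
        fname="bert_example_test_data.txt",
        origin="https://example.com/path/to/bert_example_test_data.txt",
    )
```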
I think I also like this as a forcing function for simple "out of the box" use. Needing to write a single, smallish test that runs your whole training pipeline is tricky, but it will force people to avoid sneaking in manual steps to get things working.
Talked with @fchollet about this; we should do a few things:
- Move as much logic as possible out of the runnable script files into `bert_model.py` (and potentially add a `bert_data.py`, others as needed). Runnable scripts should be very basic and only transform flags into function/class arguments.
- Change this integration test to run the end-to-end flow by invoking functionality from `bert_model.py` (see the sketch after this list). If we've done things right, this should still be a small and readable test.
- Leave the runnable scripts untested on CI for now; never test them through pytest. Maybe someday test them by invoking them directly for limited training runs.
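As a rough picture of where this should land, the test could look something like the sketch below. Everything imported from `bert_model` here is a hypothetical placeholder for whatever the refactored module ends up exposing; only the module name comes from the plan above.

```python
# Rough sketch of the restructured integration test. The functions called on
# bert_model are hypothetical placeholders for the refactored module's API.
import bert_model


def test_bert_end_to_end(tmp_path):
    # Preprocess a small slice of real data into training features.
    features = bert_model.preprocess_data(output_dir=tmp_path)
    # Build a tiny configuration so a few train steps finish quickly on CPU.
    model = bert_model.create_model(num_layers=2, hidden_dim=64, num_heads=2)
    # A handful of pretraining steps, then a handful of finetuning steps.
    bert_model.run_pretraining(model, features, num_steps=5)
    bert_model.run_finetuning(model, features, num_steps=3)
```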
Will try to find some time to redo this with those changes in mind.