
Try running network_tests on our GCP CI

Open mattdangerw opened this issue 3 years ago • 4 comments

mattdangerw avatar Sep 14 '22 23:09 mattdangerw

Actually giving it a second thought - we may not want to trigger network_tests at every request? Currently the network_tests are still small, but in the future it could be very heavy.

chenmoneygithub avatar Sep 15 '22 05:09 chenmoneygithub

Thanks! Yeah, I was wondering about that too. Currently it only increases the test time from ~7 minutes to ~9 minutes, which is not a big deal, but I think the right thing to do may be to only run a smoke test where we download the smallest checkpoint for each model.

That gives us some coverage for simple breakages, but leaves us a lot more scalable.

I can rejigger the testing setup tomorrow to allow for that.

mattdangerw avatar Sep 15 '22 06:09 mattdangerw

@jbischof thanks! Replies below...

Isn't some code required to link the new flag to our GCP infra?

Yes, the single line in .cloudbuild/unit_test_jobs.jsonnet does this.

Why is only one tiny test moved over?

Basic reasoning is here https://github.com/keras-team/keras-nlp/pull/357#issuecomment-1247622222 (in response to Chen's comment above it).

I would love for us to get some coverage here running on each PR, and I'm not sure that downloading all 10G of weights is very scalable. So this was an attempt to get some automated coverage without forcing us to download the various XL variants of models.

How do GCP flagged tests work with local development? Will I be able to run this on my dev box?

In the current setup from this PR (which we can change): you can run normal testing, with no network tests, with pytest. You can run testing with a small amount of network testing with pytest --run_gcp. You can run everything with pytest --run_slow.
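For anyone curious how a flag split like this is usually wired up, here is a rough sketch of a conftest.py using standard pytest hooks. The flag names (--run_gcp, --run_slow) follow the comment above; the marker names and skip logic are illustrative assumptions, not the PR's actual code.

```python
# conftest.py -- illustrative sketch, not the actual PR implementation.
import pytest


def pytest_addoption(parser):
    # Custom CLI flags; both default to off so plain `pytest` stays fast.
    parser.addoption("--run_gcp", action="store_true", default=False,
                     help="run small network tests (smallest checkpoints)")
    parser.addoption("--run_slow", action="store_true", default=False,
                     help="run everything, including large downloads")


def pytest_configure(config):
    # Register markers so `pytest --strict-markers` does not complain.
    config.addinivalue_line("markers", "gcp: small network test")
    config.addinivalue_line("markers", "slow: heavy network test")


def pytest_collection_modifyitems(config, items):
    # --run_slow is a superset: it also enables the gcp-tagged tests.
    run_slow = config.getoption("--run_slow")
    run_gcp = config.getoption("--run_gcp") or run_slow
    skip_gcp = pytest.mark.skip(reason="need --run_gcp or --run_slow")
    skip_slow = pytest.mark.skip(reason="need --run_slow")
    for item in items:
        if "slow" in item.keywords and not run_slow:
            item.add_marker(skip_slow)
        elif "gcp" in item.keywords and not run_gcp:
            item.add_marker(skip_gcp)
```

Tests then opt in with @pytest.mark.gcp or @pytest.mark.slow, and the CI job only has to pass the right flag.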

Feel free to propose some new names or a different split here! I figured this one might spark discussion.

My main goal is to get some coverage for the pretrained flows (a smoke test), running on each PR. So we avoid things like this https://github.com/keras-team/keras-nlp/pull/356. Open to suggestions on other lightweight approaches here!

mattdangerw avatar Oct 12 '22 00:10 mattdangerw

@jbischof --run_slow is still comprehensive. It will run everything tagged with --run_gcp as well.

I wonder if there is a better way I can present that so it is clearer. Maybe we should just have the two tags be slow and really_slow or something, so the relationship is more obvious. We control in code how CLI arguments map to test tags, so everything is possible here, I think.

mattdangerw avatar Oct 12 '22 18:10 mattdangerw

Feel free to merge this one whenever.

jbischof avatar Oct 24 '22 23:10 jbischof

I've updated this to move testing inside the keras_nlp/models/xx directory, and changed the flag names so we just have the notion of test sizes (currently large and extra_large), which we can map to our testing infrastructure as we see fit.
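With the size-based scheme described above, a preset test file might tag its tests roughly like this. The marker names (large, extra_large) follow the comment; the test names and bodies are hypothetical placeholders.

```python
# xx_presets_test.py -- illustrative sketch of size-tagged preset tests.
import pytest


@pytest.mark.large
def test_smallest_preset_smoke():
    # Downloads only the smallest checkpoint; cheap enough to run per PR.
    pass


@pytest.mark.extra_large
def test_all_presets():
    # Downloads every checkpoint; reserved for scheduled/nightly runs.
    pass
```

The CI config then just decides which sizes to enable for a given job, without the test files knowing anything about the infrastructure.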

This is ready for another pass!

mattdangerw avatar Oct 27 '22 18:10 mattdangerw

Looks very clean, thanks @mattdangerw!

jbischof avatar Nov 01 '22 17:11 jbischof

One alternative is that you could distribute the preset tests to existing test files; e.g., put the backbone preset tests with bert_models_test.py, the preprocessor preset tests with bert_preprocessing_tests.py, etc. This might maximize the chance that someone updating a class updates the relevant test.

jbischof avatar Nov 01 '22 17:11 jbischof

I guess my thinking here was that it is nice if the presets and preset testing can be delivered as a unit. We can definitely see today how we will land presets as a follow-up (pending approvals). It's nice if the bulk of that deliverable can come in as a set of separate files without many merge conflicts.

xx_presets.py
xx_presets_tests.py

The nice thing about the "smoke test" we will run on each PR is that it should suss out failures in the general mechanism. And when you are updating the weights wholesale with a new version, you will probably update these files as a set.

mattdangerw avatar Nov 02 '22 17:11 mattdangerw

OK, ship away!

jbischof avatar Nov 02 '22 18:11 jbischof