keras-nlp Skip unnecessary GPU testing

Our accelerator testing is taking too long, and there are some tests we don't necessarily need to verify on GPU specifically, such as the saving tests.

Mar 29 '23 07:03 chenmoneygithub

Hi @chenmoneygithub I am interested in this issue, is it ok if I take this up?

Mar 29 '23 17:03 susnato

@susnato this looks to me like an issue @chenmoneygithub filed for himself to work on!

@chenmoneygithub one thing to keep in mind is our notion of test sizes. We are currently switching to marking our saving tests with a "large" marker, because they are slow. A lot of other slower tests fit into that "large" bucket too, like our basic preset testing.

It is definitely helpful to run these large tests on each PR if possible (because the presets really do break due to contributions), and the GCP instances feel like a natural fit for that because they have more compute power.

Not sure what the right approach is, but here are a few things we may want to prioritize.

Preserving some basic save model testing per PR.
Preserving some basic preset testing per PR.
Avoiding adding too many pytest markers to the point where it is a maintenance burden.
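
To make the single "large" marker idea concrete, here is a minimal sketch of a `conftest.py` that skips tests marked `large` unless they are explicitly requested. It is an illustration only, not the project's actual configuration: the `--run_large` flag name and the `skip_reason` helper are assumptions introduced for this example.

```python
# conftest.py (hypothetical sketch, not the project's actual configuration)
from typing import Iterable, Optional


def skip_reason(marker_names: Iterable[str], run_large: bool) -> Optional[str]:
    """Return why a test should be skipped, or None if it should run."""
    if "large" in set(marker_names) and not run_large:
        return "large test; pass --run_large to include it"
    return None


def pytest_addoption(parser):
    # Hypothetical opt-in flag for the slow bucket (saving, presets).
    parser.addoption(
        "--run_large",
        action="store_true",
        default=False,
        help="also run tests marked 'large'",
    )


def pytest_configure(config):
    # Register the marker so pytest does not warn about an unknown mark.
    config.addinivalue_line(
        "markers", "large: slow tests such as saving and preset checks"
    )


def pytest_collection_modifyitems(config, items):
    # Imported here so skip_reason stays importable without pytest installed.
    import pytest

    run_large = config.getoption("--run_large")
    for item in items:
        reason = skip_reason((m.name for m in item.iter_markers()), run_large)
        if reason is not None:
            item.add_marker(pytest.mark.skip(reason=reason))
```

With something like this in place, a saving or preset test would be decorated with `@pytest.mark.large`, a plain `pytest` run would skip it, and a better-provisioned job (for example the GCP flow) could run `pytest --run_large` to include it. Keeping everything behind one marker is one way to limit the marker maintenance burden mentioned above.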

Mar 29 '23 20:03 mattdangerw

@mattdangerw My feeling is the point of accelerator testing is to test GPU compatibility, but now we are mixing GPU testing with large testing (e.g., saving). Shall we create a separate CI job only for the large tests we want to cover per PR? Our GCP testing has a capacity limit, so it's not ideal to have it handle the "large" tests.

Mar 29 '23 23:03 chenmoneygithub

> My feeling is the point of accelerator testing is to test GPU compatibility, but now we are mixing GPU testing with large testing (e.g., saving).

Talked offline, but I don't quite agree here. We need a GCP testing flow to be able to run "larger" tests generally. The free GitHub Actions machines will only get us so far. We need to be able to scale up the GCP testing flow as our project grows and our testing needs increase.

If our current accelerator testing stopped testing saved models and presets, we would lose coverage for both, which would be bad. Moving the saving and preset testing to the default GitHub Actions machines is a bad solution, as I think those machines are underpowered for the tests we want to run.

I could definitely see us needing to administer different types of GCP machines (GPU, CPU and TPU) down the road, but it feels like more headache than it is worth now.

Mar 30 '23 19:03 mattdangerw