yoyodyne
Testing
[copied from CUNY-CL/abstractness/issues/87]
We should add integration tests (I hesitate to call these unit tests), limiting ourselves to the model sizes and data quantities we can run on CircleCI's free tier. We get 6,000 compute-minutes per month. All of this is pretty generous, except that I am unclear whether we can use their GPU images or are stuck on CPU (ideally we'd parameterize tests on both). I think it ought to be possible to do actual training of the major models using, say, 1,000 examples. The test tasks could include g2p (for feature-less) and inflection (for feature-full) from SIGMORPHON.
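To keep the data quantities small and the tests repeatable, one option is to draw a fixed-seed sample of the SIGMORPHON data. This is only a sketch: `subsample` is a hypothetical helper, not an existing yoyodyne function, and the default of 1,000 simply mirrors the figure suggested above.

```python
import random
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def subsample(examples: Sequence[T], n: int = 1000, seed: int = 0) -> List[T]:
    """Returns at most n examples, chosen reproducibly.

    Hypothetical helper: a fixed seed means every CI run sees the same
    examples, so accuracy thresholds in the tests stay comparable.
    """
    if len(examples) <= n:
        # Nothing to cut; return a copy so callers can't mutate the input.
        return list(examples)
    rng = random.Random(seed)
    # Sampling without replacement; order follows the RNG, not the input.
    return rng.sample(examples, n)
```

Passing an explicit `seed` rather than relying on global random state keeps the sample independent of whatever else the test suite does.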
The current training and prediction functions are structured to read from and write to the file system directly. They should be modularized to take ordinary arguments and return their results:
- for training, a function could simply return the best model (or its path) with metadata (wall clock time, training accuracy, development accuracy), and then the command-line enabled version of that loop could invoke this function
- for prediction, a function could simply return the accuracy.
These functions can then be called by the existing training and prediction functions (the ones that return nothing and are parameterized with click flags).
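A rough sketch of the split described above, under loud assumptions: `TrainResult`, `train`, `predict`, `read_tsv` and `train_cli` are hypothetical names, not yoyodyne's actual API, and the "model" is a toy lookup table standing in for a real one. The click decorators are left out so the sketch runs with the standard library alone; in the real code the wrapper would carry them.

```python
import dataclasses
import time
from typing import Callable, List, Sequence, Tuple

Example = Tuple[str, str]  # (source, target) pair.
Model = Callable[[str], str]  # Toy stand-in for a trained model.


@dataclasses.dataclass
class TrainResult:
    """Best model plus the metadata the issue asks for (hypothetical)."""

    best_model: Model
    wall_clock_seconds: float
    train_accuracy: float
    dev_accuracy: float


def predict(model: Model, data: Sequence[Example]) -> float:
    """Returns accuracy on data; touches no files."""
    if not data:
        return 0.0
    correct = sum(model(source) == target for source, target in data)
    return correct / len(data)


def train(
    train_data: Sequence[Example], dev_data: Sequence[Example]
) -> TrainResult:
    """Takes ordinary arguments and returns the result; touches no files."""
    start = time.perf_counter()
    # Placeholder "training": memorize the training pairs.
    table = dict(train_data)

    def model(source: str) -> str:
        # Unseen sources are copied through unchanged.
        return table.get(source, source)

    return TrainResult(
        best_model=model,
        wall_clock_seconds=time.perf_counter() - start,
        train_accuracy=predict(model, train_data),
        dev_accuracy=predict(model, dev_data),
    )


def read_tsv(path: str) -> List[Example]:
    """Reads two-column TSV; assumes one source/target pair per line."""
    with open(path, encoding="utf-8") as source:
        rows = (line.rstrip("\n").split("\t") for line in source)
        return [(row[0], row[1]) for row in rows]


def train_cli(train_path: str, dev_path: str) -> None:
    """Thin wrapper: does the file I/O, delegates, returns nothing."""
    result = train(read_tsv(train_path), read_tsv(dev_path))
    print(f"dev accuracy: {result.dev_accuracy:.4f}")
```

With this shape, an integration test, a benchmark, or a sweep can call `train` and `predict` directly and assert on or log the returned values, without parsing anything off disk.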
This will also support two other projects (issues coming soon):
- benchmarking
- W&B-enabled hyperparameter sweeping
This is a blocker for a post-beta release candidate.