yoyodyne

Testing

Open kylebgorman opened this issue 2 years ago • 1 comments

[copied from CUNY-CL/abstractness/issues/87]

We should add integration tests (I hesitate to call these unit tests), limiting ourselves to the model sizes and data quantities we can run on CircleCI's free tier. We get 6,000 compute-minutes per month, which is pretty generous, though I am unclear whether we can use their GPU images or are stuck on CPU (ideally we'd parameterize tests on both). I think it ought to be possible to do actual training of the major models using, say, 1,000 examples. The tests could include g2p (for feature-less) and inflection (for feature-full) from SIGMORPHON.
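As a rough sketch of how a test fixture might cut a data set down to about 1,000 examples, here is a deterministic subsampler. The function name `subsample`, the default seed, and the assumption that each example is one tab-separated line are all hypothetical; none of this is existing yoyodyne code.

```python
import random
from typing import List, Sequence


def subsample(lines: Sequence[str], size: int = 1000, seed: int = 0) -> List[str]:
    """Returns at most `size` lines, sampled without replacement.

    A fixed seed keeps the sample identical across CI runs, so accuracy
    thresholds in the tests stay comparable from one build to the next.
    """
    if len(lines) <= size:
        # Nothing to cut down; return a copy of everything.
        return list(lines)
    rng = random.Random(seed)
    return rng.sample(lines, size)
```

Seeding a private `random.Random` instance, rather than the module-level generator, avoids interfering with any seeding the training code does itself.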

The current training and prediction functions are structured to read from and write to the file system directly. They should be modularized to take ordinary arguments and return their results:

  • for training, a function could simply return the best model (or its path) with metadata (wall clock time, training accuracy, development accuracy), and then the command-line enabled version of that loop could invoke this
  • for prediction, a function could simply return the accuracy.

These functions can then be called by the existing training and prediction functions (the ones parameterized with click flags, which return nothing).
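A minimal sketch of what that split could look like follows. Everything here is hypothetical: `TrainResult`, `train`, `predict`, and `accuracy` are illustrative names, and the "model" is a toy lookup table standing in for a real one. The point is only the shape of the interface: ordinary arguments in, results out, no file system access.

```python
import dataclasses
import time
from typing import Dict, List, Tuple

Example = Tuple[str, str]  # A (source, target) pair.


@dataclasses.dataclass
class TrainResult:
    """Hypothetical container for what a training function could return."""

    model: Dict[str, str]  # Stand-in for the best model (or its path).
    wall_clock_seconds: float
    train_accuracy: float
    dev_accuracy: float


def accuracy(model: Dict[str, str], data: List[Example]) -> float:
    """Computes exact-match accuracy of the toy model on the data."""
    if not data:
        return 0.0
    correct = sum(model.get(source) == target for source, target in data)
    return correct / len(data)


def train(train_data: List[Example], dev_data: List[Example]) -> TrainResult:
    """Trains and returns the model with metadata; touches no files."""
    start = time.perf_counter()
    # Toy "training": memorize the training pairs. A real model goes here.
    model = dict(train_data)
    elapsed = time.perf_counter() - start
    return TrainResult(
        model=model,
        wall_clock_seconds=elapsed,
        train_accuracy=accuracy(model, train_data),
        dev_accuracy=accuracy(model, dev_data),
    )


def predict(model: Dict[str, str], data: List[Example]) -> float:
    """Evaluates the model on the data and returns the accuracy."""
    return accuracy(model, data)
```

Under this design, the click-decorated commands would shrink to thin wrappers that parse flags, read the data, call `train` or `predict`, and write the outputs, while tests, benchmarks, and sweeps call the inner functions directly.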

This will also support two other projects (issues coming soon):

  • benchmarking
  • W&B-enabled hyperparameter sweeping

This is a blocker for a post-beta release candidate.

kylebgorman, Dec 09 '22 17:12

Yoyodyne test strategy.pdf

The above describes my current thinking about the test strategy.

kylebgorman, Dec 30 '22 21:12