
[Proposal] Add Automatic (Unit) Testing and CI Workflows

Open dest1n1s opened this issue 1 year ago • 1 comments

Automatic testing is fundamental to keeping a collaboratively developed project from accumulating bugs that break modules which originally worked. For a deep learning library, always running the whole training or analysis process end to end consumes a lot of time and computational resources. Minor bugs may also never be triggered in a fixed training setting. It is therefore necessary to test at different levels to ensure proper functioning as far as possible.

I propose adding the following 4 categories of testing:

  • Unit testing: testing whether every innermost method works correctly with mock data, e.g. a single forward pass in a minimal SAE, or a single generation of activations. Unit tests should cover almost all parts of the library, so every single test must run fast.
  • Integration testing: testing whether low-level modules work properly with one another, e.g. getting feature activations directly from text input (which requires transformers and SAEs to work together), a single training pass, and loading pretrained SAEs from HuggingFace. These tests should cover common usage of the library at a fairly high level. Each should run in an acceptable time (perhaps no more than several seconds) and, if possible, should not depend on GPUs.
  • Acceptance testing: testing whether modules meet performance expectations (loss, memory allocated, time cost), e.g. whether a pretrained SAE gives a reasonable loss. Some of these tests may require GPUs to run. Failure of these tests may be acceptable in some situations.
  • Benchmarks: measuring the time taken by a complete process and by some bottleneck modules.
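To make the unit-testing category concrete, here is a minimal sketch of a fast test for a single forward pass on mock data. `TinySAE` is a hypothetical stand-in written with plain Python lists so the example has no dependencies; it is not the library's SAE class, and the real tests would presumably use the library's own modules and tensors.

```python
import random
import unittest


class TinySAE:
    """Hypothetical minimal sparse autoencoder (not the library's API)."""

    def __init__(self, d_model, d_sae, seed=0):
        rng = random.Random(seed)  # fixed seed keeps the test deterministic
        # Encoder weights: d_sae rows of d_model values each.
        self.w_enc = [[rng.uniform(-1, 1) for _ in range(d_model)]
                      for _ in range(d_sae)]
        # Decoder weights: d_model rows of d_sae values each.
        self.w_dec = [[rng.uniform(-1, 1) for _ in range(d_sae)]
                      for _ in range(d_model)]

    def encode(self, x):
        # ReLU(W_enc x): feature activations are non-negative.
        return [max(0.0, sum(w * v for w, v in zip(row, x)))
                for row in self.w_enc]

    def decode(self, features):
        return [sum(w * f for w, f in zip(row, features))
                for row in self.w_dec]

    def forward(self, x):
        return self.decode(self.encode(x))


class TestTinySAE(unittest.TestCase):
    def setUp(self):
        self.sae = TinySAE(d_model=4, d_sae=8)
        self.x = [0.5, -1.0, 0.25, 2.0]  # mock activation vector

    def test_encode_shape_and_sign(self):
        features = self.sae.encode(self.x)
        self.assertEqual(len(features), 8)
        self.assertTrue(all(f >= 0.0 for f in features))

    def test_forward_shape(self):
        self.assertEqual(len(self.sae.forward(self.x)), 4)

    def test_zero_input_gives_zero_output(self):
        # This sketch has no bias terms, so a zero input reconstructs to zero.
        self.assertEqual(self.sae.forward([0.0] * 4), [0.0] * 4)
```

Saved as a test module, this can be run with `python -m unittest`. Each test checks one property (shape, sign, a known input/output pair) and needs neither a GPU nor a pretrained model, which is what keeps it fast.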

Continuous Integration (CI) with GitHub workflows should also be added to run the tests on every push and PR. PRs should not be merged unless all tests pass.
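One possible reading of the merge rule, given that acceptance failures may sometimes be tolerated, is sketched below. The split into blocking and non-blocking categories, and the names `BLOCKING`, `SuiteResult`, and `can_merge`, are assumptions made for illustration; the proposal itself does not specify this policy.

```python
from dataclasses import dataclass

# Assumed policy: only these categories must pass before a merge.
BLOCKING = {"unit", "integration"}


@dataclass
class SuiteResult:
    category: str  # "unit", "integration", "acceptance", or "benchmark"
    passed: bool


def can_merge(results):
    """Return True when every blocking suite ran and passed."""
    seen = {r.category for r in results}
    if not BLOCKING <= seen:
        return False  # a blocking suite did not run at all
    return all(r.passed for r in results if r.category in BLOCKING)
```

Under this sketch, a failed acceptance run is reported but does not block the PR, while a missing or failing unit or integration run does.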

dest1n1s avatar Jun 08 '24 09:06 dest1n1s

Tracking: @Frankstein73 has added CI workflows that automatically run the tests on pushes and PRs to the main/dev branches. Feel free to fill in the missing tests for all modules!

dest1n1s avatar Jun 20 '24 12:06 dest1n1s

The unit and integration tests now cover most methods in training, activation generation, and analysis. Closing this.

dest1n1s avatar Jan 14 '25 16:01 dest1n1s

You did not forget where you started. Great engineers do great jobs

Hzfinfdu avatar Jan 14 '25 17:01 Hzfinfdu