
feat: Set up comprehensive Python testing infrastructure with Poetry

Open llbbl opened this issue 6 months ago • 0 comments

Set up Python Testing Infrastructure

Summary

This PR establishes a comprehensive testing infrastructure for the Grover project using Poetry as the package manager and pytest as the testing framework. The setup provides a ready-to-use testing environment where developers can immediately start writing tests.

Changes Made

Package Management

  • Configured Poetry as the primary package manager via pyproject.toml
  • Migrated existing dependencies from requirements-gpu.txt and requirements-tpu.txt
  • Updated dependency versions for better compatibility

Testing Framework

  • Added pytest, pytest-cov, and pytest-mock as development dependencies
  • Configured pytest with:
    • Test discovery patterns for test_*.py and *_test.py files
    • Coverage reporting with an 80% threshold
    • HTML and XML coverage output formats
    • Custom test markers: unit, integration, and slow
    • Strict mode with helpful output formatting
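
As a rough illustration of the custom markers listed above, the sketch below registers unit, integration, and slow through pytest's `pytest_configure` hook and applies them to two tests. This is an assumption about one way to do it: the PR most likely declares the markers in its pytest configuration instead, and the marker descriptions, file names, and test functions here are hypothetical.

```python
# conftest.py (sketch): register the three custom markers at startup.
import pytest

# Marker descriptions are illustrative wording, not copied from the PR.
CUSTOM_MARKERS = {
    "unit": "fast, isolated tests",
    "integration": "tests that touch several components",
    "slow": "long-running tests",
}


def pytest_configure(config):
    """Register each marker so that `pytest --markers` lists it."""
    for name, description in CUSTOM_MARKERS.items():
        config.addinivalue_line("markers", f"{name}: {description}")


# tests/unit/test_example.py (hypothetical file): applying the markers.
@pytest.mark.unit
def test_addition():
    assert 1 + 1 == 2


@pytest.mark.integration
@pytest.mark.slow
def test_round_trip():
    assert "".join(reversed("abc")) == "cba"
```

With markers applied this way, `pytest -m "not slow"` would deselect `test_round_trip` and still run `test_addition`.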

Directory Structure

tests/
├── __init__.py
├── conftest.py          # Shared fixtures and configuration
├── test_setup_validation.py  # Infrastructure validation tests
├── unit/
│   └── __init__.py
└── integration/
    └── __init__.py

Fixtures (in conftest.py)

  • temp_dir: Temporary directory for file operations
  • mock_config: Mock configuration dictionary
  • sample_json_data: Sample JSON test data
  • sample_jsonl_file: Creates temporary JSONL files
  • mock_model_checkpoint: Mock TensorFlow checkpoint structure
  • mock_vocab_files: Mock vocabulary files for tokenization
  • sample_model_config: Model configuration matching project format
  • environment_variables: Common test environment setup
  • cleanup_tensorflow: Automatic TensorFlow resource cleanup
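
For orientation, two of the fixtures above might look roughly like the following. This is a minimal sketch, not the PR's actual conftest.py: the helper `write_jsonl`, the file name `sample.jsonl`, and the sample records are assumptions made for illustration.

```python
# conftest.py (sketch): possible shapes for temp_dir and sample_jsonl_file.
import json
import shutil
import tempfile
from pathlib import Path

import pytest


def write_jsonl(path, records):
    """Write one JSON object per line and return the path (hypothetical helper)."""
    with open(path, "w", encoding="utf-8") as handle:
        for record in records:
            handle.write(json.dumps(record) + "\n")
    return path


@pytest.fixture
def temp_dir():
    """Yield a fresh directory and delete it once the test finishes."""
    path = Path(tempfile.mkdtemp())
    yield path
    shutil.rmtree(path, ignore_errors=True)


@pytest.fixture
def sample_jsonl_file(temp_dir):
    """Create a small JSONL file inside temp_dir (records are made up)."""
    records = [{"id": 1, "text": "first"}, {"id": 2, "text": "second"}]
    return write_jsonl(temp_dir / "sample.jsonl", records)
```

A test would request a fixture by naming it as a parameter, for example `def test_reads_two_lines(sample_jsonl_file): ...`, and pytest would handle setup and cleanup.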

Additional Configuration

  • Updated .gitignore with testing artifacts and Claude settings
  • Kept poetry.lock out of .gitignore so that it is committed
  • Added Poetry script commands for running tests
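
Poetry script commands map a command name to a Python callable. The sketch below shows one way such an entry point could forward to pytest. The module name `run_tests`, the function names, and the mapping shown in the comment are assumptions; the PR does not state how its scripts are wired, and the module would need to be importable from the installed project.

```python
# run_tests.py (hypothetical module name)
# Assumed mapping in pyproject.toml:
#   [tool.poetry.scripts]
#   test = "run_tests:main"
#   tests = "run_tests:main"
import subprocess
import sys


def build_pytest_command(extra_args):
    """Return the argv list that runs pytest under the current interpreter."""
    return [sys.executable, "-m", "pytest", *extra_args]


def main():
    """Entry point: forward CLI arguments to pytest and mirror its exit code."""
    sys.exit(subprocess.call(build_pytest_command(sys.argv[1:])))
```

Under this assumed wiring, `poetry run test -m unit` would end up invoking `python -m pytest -m unit` inside the Poetry-managed environment.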

Running Tests

With Poetry (recommended):

# Install dependencies
poetry install

# Run all tests
poetry run test
# or
poetry run tests

# Run with specific markers
poetry run pytest -m unit
poetry run pytest -m integration
poetry run pytest -m "not slow"

Without Poetry:

# Create virtual environment
python3 -m venv .venv
source .venv/bin/activate

# Install test dependencies
pip install pytest pytest-cov pytest-mock

# Run tests
pytest
pytest -v -m unit

Test Output

All validation tests pass successfully:

  • 17 tests validating infrastructure setup
  • All fixtures work correctly
  • Test markers function as expected
  • Coverage reporting generates properly

Notes

  • The project uses TensorFlow 1.x, so dependency versions are constrained accordingly
  • Coverage is configured to monitor lm, sample, and discrimination packages
  • The 80% coverage threshold is intended to apply once actual unit tests are written
  • The validation tests verify that the infrastructure works; they do not count toward coverage

Next Steps

Developers can now:

  1. Write unit tests in tests/unit/
  2. Write integration tests in tests/integration/
  3. Use the provided fixtures for common test scenarios
  4. Run tests with coverage reporting to ensure code quality

llbbl, Jun 26 '25 12:06