Set up comprehensive Python testing infrastructure with Poetry

Open • llbbl opened this issue 6 months ago • 0 comments

Python Testing Infrastructure Setup

Summary

This PR sets up a complete testing infrastructure for the rationale benchmark Python project using Poetry as the package manager and pytest as the testing framework.

Changes Made

Package Management

  • Poetry Configuration: Created pyproject.toml with Poetry as the package manager
  • Dependency Migration: Migrated all dependencies from requirements.txt to Poetry format
  • Development Dependencies: Added testing tools as development dependencies:
    • pytest - Main testing framework
    • pytest-cov - Coverage reporting
    • pytest-mock - Mocking utilities

Testing Configuration

  • pytest Settings: Configured in pyproject.toml with:

    • Test discovery patterns for test_*.py and *_test.py files
    • Coverage thresholds set to 80%
    • Multiple coverage report formats (terminal, HTML, XML)
    • Custom markers for test categorization (unit, integration, slow)
    • Strict mode and verbose output enabled
  • Coverage Settings: Comprehensive coverage configuration with:

    • Source directory targeting rationale_benchmark
    • Exclusion patterns for test files and virtual environments
    • Branch coverage enabled
    • Detailed reporting with missing line numbers
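
As a rough illustration of the settings listed above, the sketch below embeds a hypothetical pyproject.toml fragment and parses it to check that it is well-formed. The key names (`[tool.pytest.ini_options]`, `[tool.coverage.run]`, `[tool.coverage.report]`) are the standard pytest and coverage.py ones, but the exact values, the marker descriptions, the omit patterns, and the reading of "strict mode" as `--strict-markers` are assumptions; the actual file in this PR may differ.

```python
try:
    import tomllib  # standard library from Python 3.11
except ModuleNotFoundError:
    import tomli as tomllib  # backport with the same API for older versions

# Hypothetical pyproject.toml fragment (assumed values, not copied from the PR).
PYPROJECT_SKETCH = """
[tool.pytest.ini_options]
testpaths = ["tests"]
python_files = ["test_*.py", "*_test.py"]
addopts = [
    "-v",
    "--strict-markers",
    "--cov=rationale_benchmark",
    "--cov-report=term-missing",
    "--cov-report=html",
    "--cov-report=xml",
    "--cov-fail-under=80",
]
markers = [
    "unit: fast, isolated tests",
    "integration: tests that cross module boundaries",
    "slow: long-running tests",
]

[tool.coverage.run]
source = ["rationale_benchmark"]
branch = true
omit = ["*/tests/*", "*/.venv/*"]

[tool.coverage.report]
show_missing = true
"""

# Parse the fragment and pull out the pieces described in the list above.
config = tomllib.loads(PYPROJECT_SKETCH)
pytest_opts = config["tool"]["pytest"]["ini_options"]
marker_names = [entry.split(":")[0] for entry in pytest_opts["markers"]]

print(marker_names)
print(config["tool"]["coverage"]["run"]["branch"])
```

With a layout like this, lowering or raising the coverage threshold would mean changing the single `--cov-fail-under` value.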

Directory Structure

tests/
├── __init__.py
├── conftest.py           # Shared fixtures
├── test_setup_validation.py  # Validation tests
├── unit/
│   └── __init__.py
└── integration/
    └── __init__.py

Shared Fixtures (conftest.py)

Created comprehensive fixtures for common testing needs:

  • temp_dir - Temporary directory management
  • temp_file - Temporary file creation
  • mock_config - Configuration mocking
  • sample_json_config - JSON config file generation
  • sample_data - Test data samples
  • mock_model - Model mocking
  • mock_tokenizer - Tokenizer mocking
  • environment_vars - Environment variable management
  • capture_logs - Log capturing
  • reset_random_seeds - Automatic seed resetting for reproducibility
  • mock_file_system - File system structure mocking
  • sample_predictions - Prediction result samples
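
To give a feel for what a few of these fixtures might do, here is a minimal sketch of possible bodies for temp_dir, sample_json_config and reset_random_seeds. The implementations, parameter names and placeholder config values are assumptions, not the code from this PR. In conftest.py each function would carry the `@pytest.fixture` decorator; it is left off here so the sketch runs with the standard library alone.

```python
import contextlib
import json
import random
import shutil
import tempfile
from pathlib import Path


def temp_dir():
    """Yield a fresh temporary directory and delete it after the test."""
    path = Path(tempfile.mkdtemp())
    try:
        yield path
    finally:
        shutil.rmtree(path, ignore_errors=True)


def sample_json_config(directory, settings=None):
    """Write a small JSON config into `directory` and return its path.

    As a fixture, `directory` would be supplied by the temp_dir fixture.
    """
    settings = settings or {"model": "dummy", "batch_size": 2}  # placeholder values
    config_path = Path(directory) / "config.json"
    config_path.write_text(json.dumps(settings))
    return config_path


def reset_random_seeds(seed=0):
    """Seed Python's RNG for the test body, then restore the previous state."""
    state = random.getstate()
    random.seed(seed)
    try:
        yield
    finally:
        random.setstate(state)


# Standalone demonstration: contextmanager drives the generators through
# setup and teardown in the same order pytest would.
with contextlib.contextmanager(temp_dir)() as tmp:
    config_file = sample_json_config(tmp)
    loaded = json.loads(config_file.read_text())
```

The yield-style shape keeps setup and teardown in one place, so cleanup still runs when a test fails.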

Additional Setup

  • Updated .gitignore: Added entries for:

    • Testing artifacts (.pytest_cache/, coverage.xml, htmlcov/)
    • Claude settings (.claude/*)
    • IDE files (.vscode/, .idea/, etc.)
    • OS files (.DS_Store, Thumbs.db)
  • Validation Tests: Created comprehensive validation tests to verify:

    • Project structure integrity
    • Package importability
    • Testing directory structure
    • Fixture availability and functionality
    • Marker configuration
    • Coverage configuration
    • Mocking capabilities
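
The structure and importability checks could look roughly like the sketch below. The helper names `missing_paths` and `is_importable` are hypothetical and are not taken from test_setup_validation.py; the expected paths come from the directory tree shown earlier. The demonstration runs against a throwaway directory rather than the real repository.

```python
import importlib.util
import tempfile
from pathlib import Path

# Expected layout, taken from the directory tree in this description.
EXPECTED_PATHS = [
    "tests/__init__.py",
    "tests/conftest.py",
    "tests/test_setup_validation.py",
    "tests/unit/__init__.py",
    "tests/integration/__init__.py",
]


def missing_paths(project_root):
    """Return the expected test files that do not exist under project_root."""
    root = Path(project_root)
    return [rel for rel in EXPECTED_PATHS if not (root / rel).is_file()]


def is_importable(module_name):
    """Report whether a top-level module can be found, without importing it."""
    return importlib.util.find_spec(module_name) is not None


# Build an incomplete copy of the layout and see what the check reports.
with tempfile.TemporaryDirectory() as tmp:
    for rel in EXPECTED_PATHS[:-1]:  # deliberately leave the last file out
        target = Path(tmp) / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        target.touch()
    print(missing_paths(tmp))  # prints ['tests/integration/__init__.py']
```

In a validation test, these helpers would sit behind plain assertions such as `assert missing_paths(repo_root) == []` and `assert is_importable("rationale_benchmark")`, where `repo_root` is whatever path the test resolves as the project root.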

How to Use

Installation

# Install Poetry (if not already installed)
curl -sSL https://install.python-poetry.org | python3 -

# Install dependencies
poetry install

Running Tests

# Run all tests
poetry run pytest

# Run with coverage
poetry run pytest --cov=rationale_benchmark

# Run specific test categories
poetry run pytest -m unit        # Unit tests only
poetry run pytest -m integration # Integration tests only
poetry run pytest -m "not slow"  # Exclude slow tests

# Run with different verbosity levels
poetry run pytest -v   # Verbose
poetry run pytest -q   # Quiet

# Generate coverage reports
poetry run pytest --cov=rationale_benchmark --cov-report=html
# Coverage report will be in htmlcov/index.html

Writing Tests

  1. Place unit tests in tests/unit/
  2. Place integration tests in tests/integration/
  3. Use the provided fixtures from conftest.py
  4. Mark tests appropriately:
    import pytest

    # Markers must match the names registered in pyproject.toml
    @pytest.mark.unit
    def test_my_unit_test():
        pass

    @pytest.mark.integration
    def test_my_integration_test():
        pass

    @pytest.mark.slow
    def test_my_slow_test():
        pass

Testing Infrastructure Validation

The setup includes validation tests (test_setup_validation.py) that verify:

  • All testing components are properly configured
  • Fixtures work as expected
  • Markers are available
  • Coverage reporting functions correctly

All 19 validation tests pass successfully, confirming the infrastructure is ready for use.

Notes

  • The infrastructure is configured but does not include actual unit tests for the codebase
  • Heavy ML dependencies (tensorflow, torch, etc.) are included in pyproject.toml but may require separate installation
  • Coverage threshold is set to 80% but can be adjusted in pyproject.toml
  • The Poetry lock file (poetry.lock) should be committed to ensure reproducible builds

llbbl, Aug 23 '25 19:08