eraserbenchmark
Set up comprehensive Python testing infrastructure with Poetry
Python Testing Infrastructure Setup
Summary
This PR sets up a complete testing infrastructure for the rationale benchmark Python project using Poetry as the package manager and pytest as the testing framework.
Changes Made
Package Management
- Poetry Configuration: Created pyproject.toml with Poetry as the package manager
- Dependency Migration: Migrated all dependencies from requirements.txt to Poetry format
- Development Dependencies: Added testing tools as development dependencies:
  - pytest - Main testing framework
  - pytest-cov - Coverage reporting
  - pytest-mock - Mocking utilities
Testing Configuration
- pytest Settings: Configured in pyproject.toml with:
  - Test discovery patterns for test_*.py and *_test.py files
  - Coverage thresholds set to 80%
  - Multiple coverage report formats (terminal, HTML, XML)
  - Custom markers for test categorization (unit, integration, slow)
  - Strict mode and verbose output enabled
- Coverage Settings: Comprehensive coverage configuration with:
  - Source directory targeting rationale_benchmark
  - Exclusion patterns for test files and virtual environments
  - Branch coverage enabled
  - Detailed reporting with missing line numbers
Directory Structure
tests/
├── __init__.py
├── conftest.py                 # Shared fixtures
├── test_setup_validation.py    # Validation tests
├── unit/
│   └── __init__.py
└── integration/
    └── __init__.py
Shared Fixtures (conftest.py)
Created comprehensive fixtures for common testing needs:
- temp_dir - Temporary directory management
- temp_file - Temporary file creation
- mock_config - Configuration mocking
- sample_json_config - JSON config file generation
- sample_data - Test data samples
- mock_model - Model mocking
- mock_tokenizer - Tokenizer mocking
- environment_vars - Environment variable management
- capture_logs - Log capturing
- reset_random_seeds - Automatic seed resetting for reproducibility
- mock_file_system - File system structure mocking
- sample_predictions - Prediction result samples
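The contents of conftest.py are not shown in this description. As a rough sketch of how three of the fixtures listed above might be written, the block below assumes that temp_dir yields a pathlib.Path, that sample_json_config writes a small JSON file into that directory, and that reset_random_seeds is an autouse fixture that reseeds only the standard-library random module. The helper make_temp_dir, the config keys, and the seed value are hypothetical and are not taken from the project.

```python
import json
import random
import shutil
import tempfile
from contextlib import contextmanager
from pathlib import Path

import pytest


@contextmanager
def make_temp_dir():
    """Hypothetical helper: create a scratch directory and remove it afterwards."""
    path = Path(tempfile.mkdtemp())
    try:
        yield path
    finally:
        shutil.rmtree(path, ignore_errors=True)


@pytest.fixture
def temp_dir():
    # Yield a temporary directory that is deleted once the test finishes.
    with make_temp_dir() as path:
        yield path


@pytest.fixture
def sample_json_config(temp_dir):
    # Write a small JSON config file; the keys are placeholders,
    # not the project's real configuration schema.
    config_path = temp_dir / "config.json"
    config_path.write_text(json.dumps({"batch_size": 8, "epochs": 1}))
    return config_path


@pytest.fixture(autouse=True)
def reset_random_seeds():
    # Reseed before every test so random draws are reproducible.
    # The seed value is an assumption made for this sketch.
    random.seed(42)
    yield
```

Keeping the cleanup logic in a plain context manager and wrapping it in a fixture means the same helper can be reused outside pytest. If the real fixture also reseeds numpy or torch, those calls would go next to random.seed.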
Additional Setup
- Updated .gitignore: Added entries for:
  - Testing artifacts (.pytest_cache/, coverage.xml, htmlcov/)
  - Claude settings (.claude/*)
  - IDE files (.vscode/, .idea/, etc.)
  - OS files (.DS_Store, Thumbs.db)
- Validation Tests: Created comprehensive validation tests to verify:
  - Project structure integrity
  - Package importability
  - Testing directory structure
  - Fixture availability and functionality
  - Marker configuration
  - Coverage configuration
  - Mocking capabilities
How to Use
Installation
# Install Poetry (if not already installed)
curl -sSL https://install.python-poetry.org | python3 -
# Install dependencies
poetry install
Running Tests
# Run all tests
poetry run pytest
# Run with coverage
poetry run pytest --cov=rationale_benchmark
# Run specific test categories
poetry run pytest -m unit # Unit tests only
poetry run pytest -m integration # Integration tests only
poetry run pytest -m "not slow" # Exclude slow tests
# Run with different verbosity levels
poetry run pytest -v # Verbose
poetry run pytest -q # Quiet
# Generate coverage reports
poetry run pytest --cov=rationale_benchmark --cov-report=html
# Coverage report will be in htmlcov/index.html
Writing Tests
- Place unit tests in tests/unit/
- Place integration tests in tests/integration/
- Use the provided fixtures from conftest.py
- Mark tests appropriately:

    import pytest

    @pytest.mark.unit
    def test_my_unit_test():
        pass

    @pytest.mark.integration
    def test_my_integration_test():
        pass

    @pytest.mark.slow
    def test_my_slow_test():
        pass
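To show how a marker and a fixture combine in a single test, here is a small sketch. It uses pytest's built-in tmp_path fixture so that it runs without the project's conftest.py; the project's own temp_dir fixture could presumably be used the same way. The function save_predictions and its JSON layout are hypothetical stand-ins, not code from rationale_benchmark.

```python
import json
from pathlib import Path

import pytest


def save_predictions(predictions, path: Path) -> None:
    """Hypothetical function under test: write predictions as a JSON list."""
    path.write_text(json.dumps(predictions))


@pytest.mark.unit
def test_save_predictions_round_trip(tmp_path):
    # tmp_path is a built-in pytest fixture providing a per-test directory.
    predictions = [{"annotation_id": "doc_1", "classification": "POS"}]
    out_file = tmp_path / "predictions.json"

    save_predictions(predictions, out_file)

    # Reading the file back should give exactly what was written.
    assert json.loads(out_file.read_text()) == predictions
```

Saved under tests/unit/, a test like this would be selected by `poetry run pytest -m unit`.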
Testing Infrastructure Validation
The setup includes validation tests (test_setup_validation.py) that verify:
- All testing components are properly configured
- Fixtures work as expected
- Markers are available
- Coverage reporting functions correctly
All 19 validation tests pass successfully, confirming the infrastructure is ready for use.
Notes
- The infrastructure is configured but does not include actual unit tests for the codebase
- Heavy ML dependencies (tensorflow, torch, etc.) are included in pyproject.toml but may require separate installation
- Coverage threshold is set to 80% but can be adjusted in pyproject.toml
- The Poetry lock file (poetry.lock) should be committed to ensure reproducible builds