feat: Set up Python testing infrastructure with Poetry and pytest
Add Python Testing Infrastructure
Summary
This PR adds a testing infrastructure for the pointer-summarizer project, using Poetry as the package manager and pytest as the testing framework. It covers test discovery, custom markers, shared fixtures, and coverage reporting, so tests can be added incrementally.
Changes Made
Package Management
- Poetry Configuration: Created pyproject.toml with Poetry package management configuration
- Dependencies: Added testing dependencies as development dependencies:
  - pytest (^7.4.0) - Core testing framework
  - pytest-cov (^4.1.0) - Coverage reporting plugin
  - pytest-mock (^3.11.0) - Mocking utilities
Testing Configuration
- pytest Settings:
  - Configured test discovery patterns for test_*.py and *_test.py files
  - Set up custom markers: unit, integration, and slow
  - Enabled strict marker enforcement
  - Configured verbose output with short traceback format
- Coverage Settings:
  - Source directories: data_util and training_ptr_gen
  - Coverage reports: HTML, XML, and terminal output
  - Excluded Python 2 syntax file (data_util/data.py) temporarily
  - Coverage threshold set to 0% initially (to be increased as tests are added)
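As a rough illustration of how the custom markers might be used, the sketch below tags tests with unit, integration, and slow. The tokenize helper and the file name are hypothetical and not part of this PR. With strict marker enforcement enabled, a marker that is not registered in the pytest configuration causes an error at collection time instead of being silently accepted.

```python
# tests/unit/test_markers_example.py (hypothetical file name)
import pytest


def tokenize(text):
    """Hypothetical helper under test; not part of the project."""
    return text.lower().split()


@pytest.mark.unit
def test_tokenize_splits_on_whitespace():
    # Fast, isolated check: selected by `-m unit`.
    assert tokenize("Hello World") == ["hello", "world"]


@pytest.mark.slow
@pytest.mark.integration
def test_tokenize_large_input():
    # Larger input: can be deselected with `-m "not slow"`.
    tokens = tokenize("word " * 10_000)
    assert len(tokens) == 10_000
```

Markers can be combined on the command line with pytest's `-m` expressions, for example `-m "unit and not slow"`.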
Directory Structure
tests/
├── __init__.py
├── conftest.py # Shared pytest fixtures
├── test_setup_validation.py # Validation tests
├── unit/
│   └── __init__.py
└── integration/
    └── __init__.py
Shared Fixtures (conftest.py)
Created comprehensive fixtures for common testing needs:
- temp_dir - Temporary directory management
- mock_config - Mock configuration dictionary
- sample_vocab - Mock vocabulary object
- sample_batch_data - Sample batch data for model testing
- device - PyTorch device detection
- reset_random_seeds - Reproducible test runs
- mock_data_path, mock_model_path, mock_log_path - Mock directory structures
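A minimal sketch of what three of these fixtures could look like is shown below. The fixture names match the list above, but the bodies are assumptions: the configuration keys are illustrative, and the seeding fixture here only touches Python's random module, whereas the fixtures in this PR may also seed other libraries.

```python
# Sketch of a conftest.py; fixture bodies are assumed, not copied from the PR.
import random
import shutil
import tempfile
from pathlib import Path

import pytest


@pytest.fixture
def temp_dir():
    """Yield a temporary directory and remove it after the test."""
    path = Path(tempfile.mkdtemp())
    yield path
    shutil.rmtree(path, ignore_errors=True)


@pytest.fixture
def mock_config():
    """Return a small configuration dictionary (keys are illustrative)."""
    return {"batch_size": 2, "hidden_dim": 8, "max_enc_steps": 10}


@pytest.fixture
def reset_random_seeds():
    """Seed Python's RNG before the test so random draws are repeatable."""
    random.seed(0)
    yield


def test_uses_fixtures(temp_dir, mock_config, reset_random_seeds):
    # pytest injects the fixtures by matching the argument names.
    vocab_file = temp_dir / "vocab.txt"
    vocab_file.write_text("hello 1\n")
    assert vocab_file.exists()
    assert mock_config["batch_size"] == 2
```

Fixtures defined in tests/conftest.py are available to every test under tests/ without an explicit import.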
Development Workflow
- Updated .gitignore with testing artifacts, virtual environments, and IDE files
- Note: poetry.lock is NOT gitignored to ensure reproducible builds
How to Use
Installation
# Install Poetry if not already installed
curl -sSL https://install.python-poetry.org | python3 -
# Install project dependencies
poetry install --with dev
Running Tests
Both commands work identically:
poetry run test
# or
poetry run tests
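One way two command names can resolve to the same behavior is to register both as script entries (for example under [tool.poetry.scripts]) that point at a single Python callable. The sketch below assumes a hypothetical module named run_tests with a main function; the actual module and function names used in this PR may differ.

```python
# run_tests.py (hypothetical module name)
import sys

import pytest


def main(argv=None):
    """Run pytest with the given arguments and return its exit code."""
    # When invoked as a script entry, extra CLI arguments arrive in sys.argv.
    args = list(sys.argv[1:] if argv is None else argv)
    return int(pytest.main(args))


if __name__ == "__main__":
    sys.exit(main())
```

Because the arguments are forwarded unchanged, any pytest option given after the command name reaches pytest directly.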
Test Options
All standard pytest options are available:
# Run only unit tests
poetry run test -m unit
# Run with specific verbosity
poetry run test -v
# Run specific test file
poetry run test tests/test_setup_validation.py
# Run without coverage
poetry run test --no-cov
Coverage Reports
After running tests, coverage reports are available:
- HTML Report: htmlcov/index.html
- XML Report: coverage.xml
- Terminal Report: Displayed after test run
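The XML report can also be read programmatically, for instance to track the overall percentage while the threshold is being raised. The sketch below assumes the Cobertura-style layout that coverage.py writes, where the root coverage element carries a line-rate attribute; the function name and the sample string are illustrative only.

```python
import xml.etree.ElementTree as ET


def total_line_coverage(xml_text):
    """Return the overall line coverage as a percentage."""
    root = ET.fromstring(xml_text)
    # line-rate is a fraction between 0 and 1 on the root element.
    return float(root.attrib["line-rate"]) * 100.0


# Tiny stand-in for the contents of coverage.xml.
sample = '<coverage line-rate="0.25" lines-valid="8" lines-covered="2"></coverage>'
print(f"{total_line_coverage(sample):.1f}%")
```

For a real report, the text would come from reading coverage.xml in the project root after a test run.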
Notes
- Python Version: The project appears to have some Python 2 syntax (e.g., data_util/data.py). These files are temporarily excluded from coverage until migrated.
- pyrouge Dependency: The original pyrouge package is not available on PyPI. It needs to be installed separately following its specific installation instructions.
- Coverage Threshold: Currently set to 0% to allow the infrastructure setup to complete. This should be increased (e.g., to 80%) as actual tests are added.
- Validation Tests: The PR includes validation tests that verify the testing infrastructure is working correctly. These are not unit tests for the actual codebase.
Next Steps
With this testing infrastructure in place, developers can now:
- Write unit tests for individual modules
- Add integration tests for end-to-end workflows
- Gradually increase the coverage threshold
- Consider migrating Python 2 code to Python 3 for full coverage