feat: Add comprehensive Python testing infrastructure with Poetry

Open llbbl opened this issue 6 months ago • 0 comments

Add Python Testing Infrastructure

Summary

This PR sets up testing infrastructure for the BERT NLP project, with Poetry as the package manager and pytest as the testing framework. It adds the configuration, directory layout, and shared fixtures needed to start writing and running tests right away.

Changes Made

Package Management

✅ Set up Poetry as the project's package manager
✅ Created pyproject.toml with complete project configuration
✅ Added testing dependencies as development dependencies:
- pytest (^7.4.3) - Core testing framework
- pytest-cov (^4.1.0) - Coverage reporting
- pytest-mock (^3.12.0) - Mocking utilities

Testing Configuration

✅ Configured pytest in pyproject.toml with:
- Custom markers: unit, integration, slow
- Coverage reporting (HTML, XML, terminal)
- Coverage threshold set to 0% (to be increased to 80% when tests are added)
- Strict mode and comprehensive test discovery patterns
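
For readers who prefer to see marker registration in Python, the sketch below shows one possible equivalent of the marker list. It is an assumption, not the PR's actual configuration (which lives in pyproject.toml): it uses pytest's `pytest_configure` hook and `config.addinivalue_line`, and the marker descriptions are invented for illustration.

```python
# conftest.py (hypothetical alternative to listing markers in pyproject.toml)

# Marker descriptions are illustrative; the PR only names the three markers.
CUSTOM_MARKERS = {
    "unit": "fast, isolated tests",
    "integration": "tests that exercise several components together",
    "slow": "tests that take a long time to run",
}


def pytest_configure(config):
    """Register each custom marker so that --strict-markers, if enabled, accepts it."""
    for name, description in CUSTOM_MARKERS.items():
        config.addinivalue_line("markers", f"{name}: {description}")
```

With strict marker checking on, a typo such as `@pytest.mark.intergration` fails at collection time instead of silently creating a new marker.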

Directory Structure

tests/
├── __init__.py
├── conftest.py          # Shared fixtures
├── test_infrastructure_validation.py
├── unit/
│   └── __init__.py
└── integration/
    └── __init__.py

Testing Fixtures (conftest.py)

Created reusable fixtures for common testing needs:

- temp_dir - Temporary directory management
- mock_config - Mock configuration objects
- mock_model - Mock model for testing
- sample_text_data - Sample text data
- sample_tokenized_data - Sample tokenized data
- mock_data_loader - Mock data loader
- capture_stdout - Stdout capture for testing print statements
- reset_random_seeds - Automatic random seed reset for reproducibility
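
As a rough illustration of how three of these fixtures might be written, here is a minimal sketch. It is not the PR's actual conftest.py: the sample sentences, the seed value, and the `seed_everything` helper are assumptions, and only the standard library RNG is seeded here.

```python
# tests/conftest.py (illustrative sketch, not the PR's actual file)
import random
import shutil
import tempfile
from pathlib import Path

import pytest

# Made-up sample sentences; the real fixture's contents are not shown in the PR.
SAMPLE_TEXTS = [
    "The quick brown fox jumps over the lazy dog.",
    "BERT is a transformer-based language model.",
]


def seed_everything(seed: int = 42) -> None:
    """Seed the stdlib RNG; a real project would likely seed numpy/torch too."""
    random.seed(seed)


@pytest.fixture
def temp_dir():
    """Yield a fresh temporary directory and remove it after the test."""
    path = Path(tempfile.mkdtemp())
    try:
        yield path
    finally:
        shutil.rmtree(path, ignore_errors=True)


@pytest.fixture
def sample_text_data():
    """Return a copy so tests cannot mutate the shared sample list."""
    return list(SAMPLE_TEXTS)


@pytest.fixture(autouse=True)
def reset_random_seeds():
    """Autouse: re-seed before every test so results are reproducible."""
    seed_everything()
    yield
```

Because `reset_random_seeds` is declared with `autouse=True`, tests do not need to request it by name; it runs before each test in the directory tree below this conftest.py.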

Additional Setup

✅ Created .gitignore with comprehensive Python/testing patterns
✅ Updated CLAUDE.md with testing commands and project structure
✅ Created validation tests to verify the infrastructure works

How to Use

Install Dependencies

poetry install

Run Tests

# Run all tests
poetry run test
# or
poetry run tests

# Run with verbose output
poetry run pytest -v

# Run only unit tests
poetry run pytest tests/unit

# Run only integration tests
poetry run pytest tests/integration

# Run tests with a specific marker
poetry run pytest -m unit

# Run with coverage report
poetry run pytest --cov

Writing Tests

- Place unit tests in tests/unit/
- Place integration tests in tests/integration/
- Use the fixtures from conftest.py for common testing needs
- Mark tests appropriately with @pytest.mark.unit, @pytest.mark.integration, or @pytest.mark.slow

Notes

- Coverage threshold is currently set to 0% to allow initial setup. This should be increased to 80% once actual tests are written.
- The infrastructure validation tests confirm that all components are working correctly.
- Poetry lock file (poetry.lock) is not gitignored and should be committed to ensure reproducible builds.
- All test commands are available through Poetry scripts for consistency.
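
Poetry scripts map a command name to a Python callable declared under `[tool.poetry.scripts]` in pyproject.toml. The PR does not show what `test` and `tests` point to, so the following is only a guess at the shape such an entry point could take; the module path `scripts/run_tests.py` and both function names are hypothetical.

```python
# scripts/run_tests.py (hypothetical module; the PR does not show its script targets)
import sys

import pytest


def build_pytest_args(extra_args):
    """Default to the tests/ directory, then append any user-supplied arguments."""
    return ["tests", *extra_args]


def main() -> int:
    """Entry point a Poetry script could reference as "scripts.run_tests:main"."""
    return int(pytest.main(build_pytest_args(sys.argv[1:])))
```

Under this assumption, `poetry run test -v` would forward `-v` to pytest and exit with pytest's own return code.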

Next Steps

- Start writing unit tests for existing modules
- Add integration tests for end-to-end workflows
- Increase coverage threshold to 80% once sufficient tests are written
- Consider adding additional testing tools as needed (e.g., hypothesis for property-based testing)

Jun 23 '25 01:06 llbbl