bert_language_understanding icon indicating copy to clipboard operation
bert_language_understanding copied to clipboard

feat: Add comprehensive Python testing infrastructure with Poetry

Open llbbl opened this issue 6 months ago • 0 comments

Add Python Testing Infrastructure

Summary

This PR sets up a comprehensive testing infrastructure for the BERT NLP project using Poetry as the package manager and pytest as the testing framework. The infrastructure provides everything needed to start writing and running tests immediately.

Changes Made

Package Management

  • ✅ Set up Poetry as the project's package manager
  • ✅ Created pyproject.toml with complete project configuration
  • ✅ Added testing dependencies as development dependencies:
    • pytest (^7.4.3) - Core testing framework
    • pytest-cov (^4.1.0) - Coverage reporting
    • pytest-mock (^3.12.0) - Mocking utilities

Testing Configuration

  • ✅ Configured pytest in pyproject.toml with:
    • Custom markers: unit, integration, slow
    • Coverage reporting (HTML, XML, terminal)
    • Coverage threshold set to 0% (to be increased to 80% when tests are added)
    • Strict mode and comprehensive test discovery patterns

Directory Structure

tests/
├── __init__.py
├── conftest.py          # Shared fixtures
├── test_infrastructure_validation.py
├── unit/
│   └── __init__.py
└── integration/
    └── __init__.py

Testing Fixtures (conftest.py)

Created reusable fixtures for common testing needs:

  • temp_dir - Temporary directory management
  • mock_config - Mock configuration objects
  • mock_model - Mock model for testing
  • sample_text_data - Sample text data
  • sample_tokenized_data - Sample tokenized data
  • mock_data_loader - Mock data loader
  • capture_stdout - Stdout capture for testing print statements
  • reset_random_seeds - Automatic random seed reset for reproducibility

Additional Setup

  • ✅ Created .gitignore with comprehensive Python/testing patterns
  • ✅ Updated CLAUDE.md with testing commands and project structure
  • ✅ Created validation tests to verify the infrastructure works

How to Use

Install Dependencies

poetry install

Run Tests

# Run all tests
poetry run test
# or
poetry run tests

# Run with verbose output
poetry run pytest -v

# Run only unit tests
poetry run pytest tests/unit

# Run only integration tests
poetry run pytest tests/integration

# Run tests with specific marker
poetry run pytest -m unit

# Run with coverage report
poetry run pytest --cov

Writing Tests

  1. Place unit tests in tests/unit/
  2. Place integration tests in tests/integration/
  3. Use the fixtures from conftest.py for common testing needs
  4. Mark tests appropriately with @pytest.mark.unit, @pytest.mark.integration, or @pytest.mark.slow

Notes

  • Coverage threshold is currently set to 0% to allow initial setup. This should be increased to 80% once actual tests are written
  • The infrastructure validation tests confirm that all components are working correctly
  • Poetry lock file (poetry.lock) is not gitignored and should be committed to ensure reproducible builds
  • All test commands are available through Poetry scripts for consistency

Next Steps

  1. Start writing unit tests for existing modules
  2. Add integration tests for end-to-end workflows
  3. Increase coverage threshold to 80% once sufficient tests are written
  4. Consider adding additional testing tools as needed (e.g., hypothesis for property-based testing)

llbbl avatar Jun 23 '25 01:06 llbbl