feat: Set up complete Python testing infrastructure with Poetry

Open llbbl opened this issue 4 months ago • 0 comments

Set Up Python Testing Infrastructure

Summary

This PR establishes a testing infrastructure for the Neural Relation Extraction project, using Poetry as the package manager and pytest as the testing framework. The setup provides a foundation for writing unit and integration tests, with terminal, HTML, and XML coverage reporting.

Changes Made

Package Management

  • Poetry Configuration: Created pyproject.toml with Poetry as the package manager
  • Dependencies: Migrated project dependencies (tensorflow, numpy, scikit-learn, matplotlib) to Poetry
  • Development Dependencies: Added pytest, pytest-cov, and pytest-mock as dev dependencies

Testing Configuration

  • pytest Settings: Configured test discovery, coverage thresholds (80%), and output formatting
  • Coverage Reports: Set up HTML and XML coverage reporting with branch coverage
  • Custom Markers: Added unit, integration, and slow markers for test categorization
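The PR text does not show where the three markers are declared. As one possible approach, they can be registered from conftest.py through pytest's pytest_configure hook. The sketch below assumes that approach; the marker descriptions are illustrative and not taken from the PR.

```python
# conftest.py (sketch): register custom markers so that pytest
# recognizes them and does not warn about unknown marks.

# Marker descriptions are assumptions made for illustration.
CUSTOM_MARKERS = {
    "unit": "fast, isolated tests of a single component",
    "integration": "tests that exercise several components together",
    "slow": "long-running tests, deselect with -m 'not slow'",
}


def pytest_configure(config):
    """Hook called by pytest at startup; adds one 'markers' line per marker."""
    for name, description in CUSTOM_MARKERS.items():
        config.addinivalue_line("markers", f"{name}: {description}")
```

Registered markers are listed by pytest --markers, which is a quick way to confirm the setup.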

Directory Structure

tests/
├── __init__.py
├── conftest.py           # Shared fixtures and configuration
├── test_validation.py    # Infrastructure validation tests
├── unit/
│   └── __init__.py
└── integration/
    └── __init__.py

Test Fixtures (conftest.py)

  • temp_dir: Temporary directory for test files
  • sample_data_dir: Mock data directory structure
  • mock_config: Configuration dictionary for testing
  • sample_json_config: JSON configuration file fixture
  • mock_tensorflow_session: Mock TensorFlow session
  • sample_numpy_data: Sample numpy arrays
  • mock_model: Mock model object
  • sample_text_data: Sample text data
  • mock_file_system: Mock file system operations
  • capture_stdout: Stdout capture for testing print statements
  • mock_time: Time mocking utilities
  • And more...
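The fixture bodies are not included in this description. As a rough sketch of what three of them might look like, assuming they build on pytest's built-in tmp_path fixture (the configuration keys and the default_config helper are hypothetical):

```python
# tests/conftest.py (sketch): three of the fixtures listed above.
import json

import pytest


def default_config():
    """Build the dictionary served by mock_config (keys are hypothetical)."""
    return {"batch_size": 32, "learning_rate": 0.001, "max_epochs": 2}


@pytest.fixture
def temp_dir(tmp_path):
    """Per-test temporary directory, built on pytest's tmp_path fixture."""
    return tmp_path


@pytest.fixture
def mock_config():
    """Configuration dictionary for tests."""
    return default_config()


@pytest.fixture
def sample_json_config(temp_dir, mock_config):
    """Write mock_config to a JSON file and return the file's path."""
    path = temp_dir / "config.json"
    path.write_text(json.dumps(mock_config), encoding="utf-8")
    return path
```

A test requests a fixture by naming it as a parameter, for example def test_config_file_exists(sample_json_config).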

Development Support

  • Script Commands: Both poetry run test and poetry run tests work for running tests
  • Git Ignore: Updated .gitignore with testing artifacts and Poetry files
  • Validation Tests: Created comprehensive validation tests to verify the infrastructure
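Poetry script commands resolve to Python callables declared under [tool.poetry.scripts] in pyproject.toml. The PR does not show the callable behind poetry run test, so the following is only a sketch of one way to write it; the module name run_tests and both function names are hypothetical.

```python
# run_tests.py (hypothetical module name)
import subprocess
import sys


def build_command(extra_args):
    """Return the pytest command line, forwarding any extra CLI arguments."""
    return [sys.executable, "-m", "pytest", *extra_args]


def main():
    """Entry point that a [tool.poetry.scripts] entry could reference."""
    completed = subprocess.run(build_command(sys.argv[1:]))
    # Propagate pytest's exit code so that CI jobs fail when tests fail.
    sys.exit(completed.returncode)
```

Under this assumption, both the test and tests script names would point at the same run_tests:main target.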

How to Use

Installation

# Install all dependencies including dev dependencies
poetry install

Running Tests

# Run all tests
poetry run test

# Alternative command (same as above)
poetry run tests

# Run only unit tests
poetry run pytest -m unit

# Run only integration tests
poetry run pytest -m integration

# Run tests excluding slow ones
poetry run pytest -m "not slow"

# Run with specific verbosity
poetry run pytest -v

# Run specific test file
poetry run pytest tests/test_validation.py

# Write the HTML coverage report (requires coverage collection to be enabled, e.g. via the configured pytest options)
poetry run pytest --cov-report=html

Writing Tests

  1. Place unit tests in tests/unit/
  2. Place integration tests in tests/integration/
  3. Use fixtures from conftest.py for common test needs
  4. Mark tests appropriately with @pytest.mark.unit, @pytest.mark.integration, or @pytest.mark.slow
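A minimal sketch of a unit test that follows these steps is shown below. The file name and the normalize_tokens function are hypothetical stand-ins for project code; they are defined inline only to keep the example self-contained.

```python
# tests/unit/test_tokens.py (hypothetical file and function names)
import pytest


def normalize_tokens(text):
    """Stand-in for project code under test: lowercase and split on whitespace."""
    return text.lower().split()


@pytest.mark.unit
def test_normalize_tokens_lowercases_and_splits():
    result = normalize_tokens("Neural Relation Extraction")
    assert result == ["neural", "relation", "extraction"]


@pytest.mark.unit
@pytest.mark.parametrize("text", ["", "   "])
def test_normalize_tokens_handles_blank_input(text):
    # Empty and whitespace-only input should produce no tokens.
    assert normalize_tokens(text) == []
```

Running poetry run pytest -m unit would select both tests, while -m integration would skip them.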

Coverage Reports

  • Terminal: Coverage shown automatically after test runs
  • HTML Report: View detailed coverage at htmlcov/index.html
  • XML Report: Available at coverage.xml for CI/CD integration
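For CI/CD use, the XML report can be read with the standard library. coverage.py writes it in a Cobertura-style format whose root element carries line-rate and branch-rate attributes as fractions between 0 and 1. The sketch below checks the two rates separately, which is an assumption about what a pipeline might want; the function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def read_coverage_rates(xml_text):
    """Return (line_rate, branch_rate) as percentages from the report's root element."""
    root = ET.fromstring(xml_text)
    line_rate = float(root.get("line-rate", "0")) * 100
    branch_rate = float(root.get("branch-rate", "0")) * 100
    return line_rate, branch_rate


def meets_threshold(xml_text, threshold=80.0):
    """True when line and branch coverage each reach the threshold."""
    line_rate, branch_rate = read_coverage_rates(xml_text)
    return line_rate >= threshold and branch_rate >= threshold
```

A pipeline step could read coverage.xml from disk, pass its text to meets_threshold, and exit with a non-zero status when the result is False.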

Notes

  • The coverage threshold is set to 80%. With branch coverage enabled, the threshold applies to the combined line and branch total rather than to each figure separately
  • The testing infrastructure is ready for immediate use; developers can start writing tests right away
  • All pytest standard options are available through the Poetry commands
  • The poetry.lock file should be committed to ensure reproducible builds across environments

Next Steps

With this testing infrastructure in place, the team can now:

  1. Write unit tests for individual components (model, utils, scripts)
  2. Create integration tests for end-to-end workflows
  3. Add performance benchmarks using the slow marker
  4. Integrate with CI/CD pipelines using the XML coverage reports

llbbl, Aug 23 '25 15:08