feat: Add comprehensive Python testing infrastructure with Poetry
Add Python Testing Infrastructure
Summary
This PR establishes a comprehensive testing infrastructure for the soweego project using modern Python testing tools and best practices. The setup provides a solid foundation for writing and maintaining tests going forward.
Changes Made
Package Management
- Poetry Configuration: Created `pyproject.toml` with Poetry as the package manager
- Dependency Migration: Migrated all project dependencies from scattered declarations to centralized Poetry management
- Development Dependencies: Added pytest, pytest-cov, and pytest-mock as development dependencies
Testing Framework Setup
- Pytest Configuration:
  - Configured test discovery patterns for flexible test file naming
  - Set up coverage reporting with HTML and XML output formats
  - Added custom markers for test categorization (unit, integration, slow)
  - Configured strict mode for better test quality
- Coverage Configuration:
  - Set 80% coverage threshold (currently commented out until actual tests are written)
  - Configured exclusion patterns for non-testable code (see the example below)
  - Set up both terminal and HTML coverage reports
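For reference, a typical exclusion relies on the standard coverage.py pragma comment; this is only an illustration, and the concrete exclusion patterns live in `pyproject.toml`:

```python
# Illustrative example of excluding non-testable code from coverage;
# the actual patterns configured in pyproject.toml may differ.
def main() -> None:
    print("Entry point exercised manually rather than by the test suite.")


if __name__ == "__main__":  # pragma: no cover
    main()
```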
Directory Structure
tests/
├── __init__.py
├── conftest.py # Shared fixtures and configuration
├── test_infrastructure_validation.py # Validation tests
├── unit/
│ └── __init__.py
└── integration/
└── __init__.py
Test Fixtures (conftest.py)
Created comprehensive fixtures for common testing needs:
- `temp_dir` / `temp_file`: Temporary file system resources
- `mock_config`: Configuration dictionary for testing
- `mock_database_session`: SQLAlchemy session mock
- `mock_http_client`: HTTP client mock for API testing
- `sample_entity_data`: Wikidata entity test data
- `cli_runner`: Click CLI testing runner
- `mock_wikidata_api`: Wikidata API mock
- `sample_csv_data` / `sample_json_data`: Test data files
- `isolated_filesystem`: Isolated test environment
- `mock_logger`: Logger mock for testing logging behavior
- `mock_sparql_results`: SPARQL query result mocks
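As a rough sketch, a few of these fixtures might be defined along these lines (illustrative only; the bodies and return values are assumptions, not the exact `conftest.py` contents):

```python
# conftest.py (sketch) -- fixture names match the list above,
# but the bodies here are illustrative assumptions.
import tempfile
from pathlib import Path

import pytest
from click.testing import CliRunner


@pytest.fixture
def temp_dir():
    """Yield a temporary directory that is cleaned up after the test."""
    with tempfile.TemporaryDirectory() as tmp:
        yield Path(tmp)


@pytest.fixture
def mock_config():
    """Return a throwaway configuration dictionary for tests."""
    return {"log_level": "DEBUG"}  # assumed keys, for illustration only


@pytest.fixture
def cli_runner():
    """Provide a Click runner for invoking CLI commands in tests."""
    return CliRunner()
```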
Additional Configuration
- Makefile: Added convenience commands for running tests
- .gitignore Updates: Added entries for testing artifacts, Poetry lock file, and Claude settings
How to Use
Installation
# Install Poetry if not already installed
curl -sSL https://install.python-poetry.org | python3 -
# Install dependencies
poetry install
# Install only development dependencies
poetry install --only dev
Running Tests
# Run all tests
poetry run pytest
# Run with coverage report
poetry run pytest --cov-report=term-missing
# Run specific test markers
poetry run pytest -m unit
poetry run pytest -m integration
poetry run pytest -m "not slow"
# Using Makefile commands
make test # Run all tests
make tests # Alternative command
make coverage # Run with detailed coverage report
make clean # Clean test artifacts
Writing New Tests
- Place unit tests in `tests/unit/`
- Place integration tests in `tests/integration/`
- Use the fixtures from `conftest.py` for common test needs
- Mark tests appropriately with `@pytest.mark.unit`, `@pytest.mark.integration`, or `@pytest.mark.slow` (see the sketch after this list)
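A new unit test following these conventions could look roughly like this (a sketch only; the assertions are placeholders, and it assumes `temp_dir` yields a `pathlib.Path` as in the fixture sketch above):

```python
# tests/unit/test_example.py -- placeholder example, not real soweego code.
import pytest


@pytest.mark.unit
def test_mock_config_is_a_dict(mock_config):
    # Uses the shared mock_config fixture from conftest.py
    assert isinstance(mock_config, dict)


@pytest.mark.unit
def test_temp_dir_is_writable(temp_dir):
    # Uses the shared temp_dir fixture from conftest.py
    sample = temp_dir / "sample.txt"
    sample.write_text("hello")
    assert sample.read_text() == "hello"
```

Running `poetry run pytest -m unit` would then select both tests via the marker.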
Validation
All infrastructure components have been validated with 18 passing tests that verify:
- All testing packages are properly installed
- Directory structure is correctly set up
- All fixtures are working as expected
- Pytest configuration is properly loaded
- Custom markers are registered
- Coverage configuration is in place
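For illustration, validation tests of this kind can be as simple as the following sketch (the actual tests in `test_infrastructure_validation.py` may be organized differently):

```python
# Sketch of infrastructure validation tests; illustrative only.
from pathlib import Path


def test_testing_packages_are_installed():
    # The pytest plugins ship importable modules when installed.
    import pytest_cov   # noqa: F401
    import pytest_mock  # noqa: F401


def test_directory_structure_exists():
    # Assumes this file lives directly under tests/.
    tests_root = Path(__file__).parent
    assert (tests_root / "unit" / "__init__.py").exists()
    assert (tests_root / "integration" / "__init__.py").exists()


def test_custom_markers_are_registered(pytestconfig):
    # Markers declared in pyproject.toml are exposed via the ini config.
    markers = pytestconfig.getini("markers")
    assert any(marker.startswith("unit") for marker in markers)
    assert any(marker.startswith("integration") for marker in markers)
```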
Notes
- The 80% coverage requirement is currently disabled (commented out) to allow the infrastructure to be merged without actual application tests
- To enable coverage requirements, uncomment the `fail_under = 80` lines in `pyproject.toml`
- The Poetry lock file is intentionally included in `.gitignore` to avoid dependency conflicts
- All test fixtures are designed to be extensible and can be enhanced as needed
Next Steps
- Begin writing unit tests for individual modules
- Add integration tests for key workflows
- Set up continuous integration to run tests automatically
- Enable coverage requirements once sufficient tests are written
- Consider adding additional testing tools (e.g., hypothesis for property-based testing, pytest-asyncio for async code)
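If hypothesis is adopted later, a property-based test can be as small as the following sketch (illustrative only; `hypothesis` is not added as a dependency by this PR):

```python
# Illustrative only: requires adding hypothesis as a development dependency.
from hypothesis import given, strategies as st


@given(st.text())
def test_strip_is_idempotent(value):
    # A property that must hold for any generated string.
    assert value.strip().strip() == value.strip()
```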