feat: Set up comprehensive Python testing infrastructure with Poetry

Open llbbl opened this issue 5 months ago • 0 comments

Set Up Python Testing Infrastructure for MELD

Summary

This PR establishes a comprehensive testing infrastructure for the MELD (Multimodal EmotionLines Dataset) project using Poetry as the package manager and pytest as the testing framework.

Changes Made

Package Management

Poetry Configuration: Added pyproject.toml with Poetry configuration for dependency management
Package Structure: Configured packages to include baseline and utils modules
Development Dependencies: Added pytest (7.4.0+), pytest-cov (4.1.0+), and pytest-mock (3.11.1+) as development dependencies

Testing Infrastructure

Directory Structure: Created organized test directories:
- tests/ - Root testing directory
- tests/unit/ - For unit tests
- tests/integration/ - For integration tests
- All directories include __init__.py files for proper Python package recognition

Test Configuration

pytest Configuration: Comprehensive pytest settings in pyproject.toml:
- Test discovery patterns for flexible naming conventions
- Coverage reporting with HTML and XML output formats
- Strict markers and configuration
- Custom test markers: unit, integration, slow
- Verbose output with clear failure reporting

Coverage Configuration:
- Source tracking for baseline and utils modules
- Exclusion patterns for test files and common Python artifacts
- HTML coverage reports in htmlcov/ directory
- XML coverage report for CI integration
- Coverage threshold set to 0% initially (should be increased as tests are added)

Shared Test Fixtures

Created comprehensive fixtures in conftest.py:

temp_dir: Temporary directory for test file operations
mock_config: Configuration dictionary for testing
sample_dialogue_data: Sample MELD dialogue structure
emotion_labels: List of 7 MELD emotions
sentiment_labels: List of 3 MELD sentiments
mock_csv_data: Creates mock CSV files for data loading tests
mock_pickle_data: Mock data structure for pickle file testing
data_path: Creates mock data directory structure
reset_environment: Ensures clean environment for each test
capture_stdout: Captures print statements for testing
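
As a rough illustration of how a few of these fixtures might look, here is a minimal sketch. It is not the actual conftest.py from this PR. The label names follow the public MELD annotations (the description above gives only the counts), while the `write_mock_csv` helper, the column subset, and the sample rows are assumptions made up for the example.

```python
# conftest.py (illustrative sketch, not the file shipped in this PR)
import csv
import shutil
import tempfile
from pathlib import Path

import pytest

# The seven emotion and three sentiment classes annotated in MELD.
EMOTIONS = ["anger", "disgust", "fear", "joy", "neutral", "sadness", "surprise"]
SENTIMENTS = ["positive", "negative", "neutral"]

# Assumed column subset for the mock file; real MELD CSVs carry more columns.
CSV_COLUMNS = ["Utterance", "Speaker", "Emotion", "Sentiment", "Dialogue_ID", "Utterance_ID"]


def write_mock_csv(path):
    """Write a tiny two-utterance CSV (hypothetical rows) and return its path."""
    rows = [
        ["Hello there.", "Speaker A", "neutral", "neutral", 0, 0],
        ["That is great news!", "Speaker B", "joy", "positive", 0, 1],
    ]
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(CSV_COLUMNS)
        writer.writerows(rows)
    return Path(path)


@pytest.fixture
def temp_dir():
    """Yield a scratch directory that is removed after the test finishes."""
    path = Path(tempfile.mkdtemp())
    yield path
    shutil.rmtree(path, ignore_errors=True)


@pytest.fixture
def emotion_labels():
    """Return a fresh copy so tests cannot mutate the shared list."""
    return list(EMOTIONS)


@pytest.fixture
def sentiment_labels():
    return list(SENTIMENTS)


@pytest.fixture
def mock_csv_data(temp_dir):
    """Create a mock CSV inside temp_dir for data-loading tests."""
    return write_mock_csv(temp_dir / "mock_meld.csv")
```

A test would then request a fixture simply by naming it as a parameter, for example `def test_loader(mock_csv_data): ...`.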

Validation Tests

Created test_setup_validation.py to verify the infrastructure:

Confirms pytest installation
Validates project structure
Tests all fixture availability
Verifies test markers work correctly
Ensures coverage reporting is configured

Build Configuration

Git Ignore: Comprehensive .gitignore file including:
- Python artifacts (__pycache__, *.pyc, etc.)
- Testing artifacts (.pytest_cache/, coverage.xml, htmlcov/)
- Virtual environments
- IDE files
- Claude-specific directories (.claude/*)
- Build and distribution files

Poetry Scripts

Configured convenient test commands:

poetry run test - Run all tests with coverage
poetry run tests - Alternative command (both work identically)
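
Poetry scripts map a command name to a Python callable written as `module:function`. This description does not show the callable behind `test` and `tests`, so the following is only a sketch of one plausible shape; the module name `run_tests` and both function names are assumptions.

```python
# run_tests.py (hypothetical module a Poetry script entry could point at)
import subprocess
import sys


def build_command(extra_args=None):
    """Build the pytest invocation, forwarding any extra CLI arguments."""
    return [sys.executable, "-m", "pytest", *(extra_args or [])]


def main():
    """Entry point that both script names could reference."""
    # Forward everything after the script name so standard pytest options keep working.
    sys.exit(subprocess.call(build_command(sys.argv[1:])))
```

Pointing both script names at the same callable would explain why the two commands behave identically.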

How to Use

Install Dependencies:
```
poetry install
```
Run Tests:
```
poetry run test
# or
poetry run tests
```

Run Specific Test Types:
```
# Run only unit tests
poetry run pytest -m unit

# Run only integration tests
poetry run pytest -m integration

# Exclude slow tests
poetry run pytest -m "not slow"
```
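
For the `-m` selections above to match anything, tests need to carry the corresponding markers. A minimal sketch of how that might look, where `normalize_label` is a hypothetical helper invented for the example and not part of MELD:

```python
# tests/unit/test_labels.py (illustrative file name)
import pytest


def normalize_label(label):
    """Hypothetical helper: trim whitespace and lowercase a label."""
    return label.strip().lower()


@pytest.mark.unit
def test_normalize_label_strips_and_lowercases():
    assert normalize_label("  Joy ") == "joy"


@pytest.mark.integration
@pytest.mark.slow
def test_normalize_many_labels():
    # Stands in for a longer-running test; a real slow test would do real work.
    labels = ["Anger", "NEUTRAL ", " sadness"] * 1000
    assert {normalize_label(item) for item in labels} == {"anger", "neutral", "sadness"}
```

With these markers, `-m unit` selects only the first test, `-m integration` selects only the second, and `-m "not slow"` deselects the second.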

View Coverage Reports:
- HTML Report: Open htmlcov/index.html in a browser
- Terminal Report: Automatically displayed after test runs
- XML Report: Available at coverage.xml for CI tools
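
A CI step could read the overall figure from the XML report with the standard library alone. The sketch below relies on the Cobertura-style layout that coverage.py writes, where the root `coverage` element carries a `line-rate` attribute between 0 and 1; the function names are assumptions for illustration.

```python
import xml.etree.ElementTree as ET


def line_coverage_percent(report_path):
    """Return overall line coverage (0 to 100) from a coverage XML report."""
    root = ET.parse(report_path).getroot()
    # The root element stores the overall ratio as a fraction, e.g. "0.75".
    return float(root.attrib["line-rate"]) * 100


def meets_threshold(report_path, minimum_percent):
    """Return True when line coverage is at or above minimum_percent."""
    return line_coverage_percent(report_path) >= minimum_percent
```

For example, `meets_threshold("coverage.xml", 80)` would tell a CI script whether the report reaches an 80% target.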

Notes

The coverage threshold is currently set to 0% to allow initial setup. It should be raised to 80% (or another appropriate level) as actual tests are added
The infrastructure is ready for immediate test development
All standard pytest options are available through the Poetry commands
The validation tests confirm that the setup is working correctly

Next Steps

Developers can now:

Write unit tests in tests/unit/
Write integration tests in tests/integration/
Use the provided fixtures for common testing scenarios
Gradually increase coverage thresholds as tests are added

Jun 27 '25 23:06 llbbl