RecSysDatasets
feat: Add comprehensive Python testing infrastructure with Poetry
Add Python Testing Infrastructure
Summary
This PR sets up a comprehensive testing infrastructure for the Python project using Poetry as the package manager and pytest as the testing framework. The setup provides a ready-to-use testing environment where developers can immediately start writing tests.
Changes Made
Package Management
- Poetry Setup: Created pyproject.toml with Poetry configuration
- Dependency Migration: Migrated existing dependencies from requirements.txt
- Development Dependencies: Added testing tools as dev dependencies
Testing Framework
- pytest: Main testing framework with comprehensive configuration
- pytest-cov: Coverage reporting with HTML and XML output
- pytest-mock: Mocking utilities for unit tests
Configuration
- Test Discovery: Configured patterns for finding tests (test_*.py, *_test.py)
- Coverage Settings:
  - 80% coverage threshold
  - HTML reports in htmlcov/
  - XML reports for CI integration
  - Branch coverage enabled
- Custom Markers: Added unit, integration, and slow test markers
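As an illustration of how the custom markers might be applied, here is a minimal sketch of a test module. The helper split_header and both test names are hypothetical and are not part of this PR; only the marker names unit, integration, and slow come from the configuration described above.

```python
import pytest


def split_header(line: str) -> list:
    """Hypothetical helper under test: split a tab-separated header line."""
    return line.rstrip("\n").split("\t")


@pytest.mark.unit
def test_split_header_returns_field_names():
    # Fast, isolated check: no file system or network access.
    assert split_header("user_id\titem_id\trating\n") == ["user_id", "item_id", "rating"]


@pytest.mark.integration
@pytest.mark.slow
def test_header_round_trip(tmp_path):
    # tmp_path is pytest's built-in per-test temporary directory.
    path = tmp_path / "sample.inter"
    path.write_text("user_id\titem_id\n1\t10\n", encoding="utf-8")
    with path.open(encoding="utf-8") as handle:
        assert split_header(handle.readline()) == ["user_id", "item_id"]
```

With markers applied like this, poetry run pytest -m unit would select only the first test, and poetry run pytest -m "not slow" would skip the second.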
Directory Structure
tests/
├── __init__.py
├── conftest.py # Shared fixtures and configuration
├── unit/
│ └── __init__.py
├── integration/
│ └── __init__.py
└── test_setup_validation.py # Infrastructure validation tests
Shared Fixtures (conftest.py)
- temp_dir: Temporary directory for test files
- sample_dataframe: Sample pandas DataFrame
- sample_numpy_array: Sample numpy array
- mock_config: Mock configuration dictionary
- sample_csv_file: Creates test CSV files
- sample_json_file: Creates test JSON files
- mock_dataset_files: Creates mock dataset files (.inter, .user, .item)
- reset_environment: Resets environment variables
- capture_logs: Captures log messages during tests
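For readers unfamiliar with pytest fixtures, the following is a rough sketch of how two of the fixtures listed above could be written. It is not the actual conftest.py from this PR: the helper write_mock_dataset, the default name "mock", and the file contents are all assumptions made for illustration.

```python
import shutil
import tempfile
from pathlib import Path

import pytest


def write_mock_dataset(directory: Path, name: str = "mock") -> dict:
    """Write tiny tab-separated .inter, .user and .item files (made-up contents)."""
    contents = {
        ".inter": "user_id\titem_id\trating\n1\t10\t5.0\n",
        ".user": "user_id\tage\n1\t30\n",
        ".item": "item_id\ttitle\n10\tExample\n",
    }
    paths = {}
    for suffix, text in contents.items():
        path = directory / f"{name}{suffix}"
        path.write_text(text, encoding="utf-8")
        paths[suffix] = path
    return paths


@pytest.fixture
def temp_dir():
    """Yield a temporary directory and remove it once the test finishes."""
    path = Path(tempfile.mkdtemp())
    yield path
    shutil.rmtree(path, ignore_errors=True)


@pytest.fixture
def mock_dataset_files(temp_dir):
    """Build the mock dataset inside temp_dir and return the created paths."""
    return write_mock_dataset(temp_dir)
```

A test would request the fixture by naming it as a parameter, for example def test_reader(mock_dataset_files), and pytest would handle setup and cleanup.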
Development Commands
- poetry run test - Run all tests with coverage
- poetry run tests - Alternative command (both work)
- Standard pytest options are available (e.g., -v, -k, --markers)
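Poetry can expose a command such as poetry run test by mapping a script name to a Python function in the [tool.poetry.scripts] section of pyproject.toml. The PR description does not show how the two commands are wired up, so the sketch below is only one plausible shape for such an entry point; the function names run_tests and build_pytest_command are hypothetical.

```python
import subprocess
import sys


def build_pytest_command(extra_args):
    """Build the pytest invocation, forwarding any extra command-line options."""
    # Running pytest through the current interpreter keeps it inside
    # the same virtual environment that Poetry activated.
    return [sys.executable, "-m", "pytest", *extra_args]


def run_tests():
    """Hypothetical entry point that a Poetry script entry could point at."""
    # Options such as -v or -k are passed through unchanged.
    return subprocess.call(build_pytest_command(sys.argv[1:]))
```

Forwarding sys.argv in this way is what would let the standard pytest options work with the shortcut commands.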
Additional Setup
- Created comprehensive .gitignore with Python, testing, and Claude entries
- Added validation tests to verify the infrastructure works correctly
- All tests pass successfully (14 validation tests)
How to Use
- Install dependencies:
      poetry install
- Run tests:
      poetry run test
      # or
      poetry run tests
- Run specific test types:
      poetry run pytest -m unit            # Run only unit tests
      poetry run pytest -m integration     # Run only integration tests
      poetry run pytest -m "not slow"      # Skip slow tests
- View coverage report:
  - HTML report: Open htmlcov/index.html in a browser
  - Terminal report: Included in test output
  - XML report: coverage.xml for CI integration
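For CI integration, the XML report can also be inspected programmatically. The sketch below assumes coverage.xml uses the Cobertura-style layout that coverage.py emits, with a line-rate attribute on the root element. The function names and the sample document are illustrative only. Note that it reads line coverage alone, which may differ from the combined figure that the 80% threshold is checked against when branch coverage is enabled.

```python
import xml.etree.ElementTree as ET


def read_line_coverage_percent(xml_text: str) -> float:
    """Return line coverage as a percentage from a Cobertura-style report."""
    root = ET.fromstring(xml_text)
    # line-rate is stored as a fraction between 0 and 1.
    return float(root.attrib["line-rate"]) * 100.0


def meets_threshold(xml_text: str, threshold: float = 80.0) -> bool:
    """Check line coverage against a threshold (80% mirrors the configured value)."""
    return read_line_coverage_percent(xml_text) >= threshold


# Made-up sample standing in for the contents of coverage.xml.
SAMPLE_REPORT = '<coverage line-rate="0.85" branch-rate="0.5"></coverage>'
```

In a CI job, the file contents could be read with Path("coverage.xml").read_text() and passed to meets_threshold.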
Notes
- The coverage threshold is set to 80%. Until tests for the codebase itself are written, the coverage check will fail even though the validation tests pass
- The infrastructure is ready for immediate test development
- All pytest standard features are available
- The Poetry lock file has been created and should be committed to ensure reproducible builds
Next Steps
Developers can now start writing unit and integration tests for the conversion tools modules. The infrastructure provides all necessary tools and configurations for comprehensive testing.